DDPM (Denoising Diffusion Probabilistic Models) — AI glossary

DDPMs (Denoising Diffusion Probabilistic Models) are a class of generative models that learn to generate data by reversing a gradual noising process. During training, the model learns to predict the noise that was added to an image at a randomly chosen timestep. During inference, it starts from pure random noise and iteratively denoises it over many steps (typically 50 to 1000) to produce a final image. Each step requires one forward pass through the model, so generation cost grows roughly linearly with the step count. DDPMs are the foundation of modern diffusion models like Stable Diffusion, but they are slower than single-step models (e.g., GANs) because the denoising steps must run sequentially. Operators encounter DDPMs when choosing sampling methods: using fewer steps (e.g., 20) speeds up generation at the cost of quality, while more steps generally improve fidelity but increase latency.

Deeper dive

DDPMs work by defining a forward diffusion process that gradually adds Gaussian noise to data over T timesteps, turning it into (approximately) pure noise. How much noise is added at each timestep is set by a noise schedule (e.g., linear or cosine), which is fixed when the model is trained. The model is trained to reverse this process: given a noisy image at timestep t, it predicts the noise component so that the image can be denoised back to timestep t-1. The reverse process is Markovian, meaning each step depends only on the result of the step before it. During sampling, the model starts from random noise and applies the learned denoising step repeatedly. T is typically 1000 during training; during inference, operators can run the loop over a subset of those timesteps (e.g., 50) to trade quality for speed. DDPMs are closely related to score-based models and stochastic differential equations. They produce high-quality samples but are slower than GANs or flow-based models, which generate a sample in a single pass. Variants like DDIM (Denoising Diffusion Implicit Models) allow deterministic sampling with fewer steps, and latent diffusion models (e.g., Stable Diffusion) apply the process in a compressed latent space to reduce computational cost.
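The forward process described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the linear schedule endpoints (1e-4 and 0.02), the function names, and the four-value toy "image" are all assumptions made for the example. It relies on the closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, which lets training jump directly to any timestep instead of adding noise one step at a time.

```python
import math
import random

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances beta_t, linearly spaced (assumed endpoints)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Noise x0 directly to timestep t and return (x_t, eps)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    x_t = [signal * x + noise * e for x, e in zip(x0, eps)]
    # eps is what a noise-prediction network would be trained to recover.
    return x_t, eps

betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [0.5, -1.0, 0.25, 0.0]  # hypothetical toy "image" of four pixel values
x_t, eps = q_sample(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

With this schedule, alpha_bars shrinks from just under 1 at the first timestep to nearly 0 at the last, which is why the final x_T is close to pure Gaussian noise regardless of x0.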

Practical example

When generating an image with Stable Diffusion in ComfyUI, the operator picks the sampler (e.g., 'ddpm'), the scheduler (e.g., 'normal'), and the number of steps (e.g., 50) on the sampler node. Each step requires a forward pass through the U-Net model. As a rough illustration, on an RTX 4090, 50 steps might take ~5 seconds for a 512x512 image; actual timings depend on the model, precision, and settings. Using a faster sampler such as DPM++ 2M Karras (in ComfyUI, the 'dpmpp_2m' sampler with the 'karras' scheduler) with 20 steps might reduce this to ~2 seconds with similar quality. An operator might choose DDPM when fidelity matters more than latency.
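The relationship between step count and latency can be made concrete with a small sketch. The evenly strided timestep selection below is one common convention rather than the behavior of any particular tool, the function names are hypothetical, and the per-step cost is simply derived from the rough 50-steps-in-5-seconds figure above.

```python
def inference_timesteps(num_train_timesteps=1000, num_inference_steps=50):
    """Evenly strided subset of the training timesteps, noisiest first."""
    if not 1 <= num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in [1, num_train_timesteps]")
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]

def estimated_latency(num_inference_steps, seconds_per_step):
    """One model forward pass per step, so cost scales linearly with steps."""
    return num_inference_steps * seconds_per_step

# Assumed per-step cost, taken from the illustrative figure of ~5 s for 50 steps.
SECONDS_PER_STEP = 5.0 / 50

timesteps = inference_timesteps(1000, 50)        # starts at 980, ends at 0
latency_20 = estimated_latency(20, SECONDS_PER_STEP)
```

Under these assumptions, dropping from 50 to 20 steps cuts the estimate from about 5 seconds to about 2 seconds. The estimate ignores fixed overheads such as text encoding and VAE decoding.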

Workflow example

In ComfyUI or Automatic1111, the operator selects a sampler from a dropdown and sets the number of steps in a separate field; the available samplers and the default step count vary by tool and version. Where a 'DDPM' or 'ddpm' entry is offered, the workflow runs the DDPM denoising loop: at each step, the model predicts the noise, a scaled portion of that prediction is subtracted from the current image, and a small amount of fresh noise is added before the next step (no noise is added after the final step). The operator can monitor progress via the step counter and adjust the step count to balance quality and speed. In Hugging Face Diffusers, the DDPM scheduler is instantiated with DDPMScheduler(num_train_timesteps=1000), and the number of inference steps is set with scheduler.set_timesteps(num_inference_steps=50).
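The denoising loop described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the code any of these tools run: it uses a linear schedule with assumed endpoints, sets the per-step noise variance to beta_t (one of several valid choices), runs over all 1000 timesteps, and replaces the trained U-Net with a hypothetical oracle_noise function that is only exact for a dataset containing a single point, TARGET.

```python
import math
import random

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule plus cumulative products alpha_bar_t."""
    step = (beta_end - beta_start) / (num_steps - 1)
    betas = [beta_start + i * step for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars

def ddpm_step(x_t, eps_pred, t, betas, alpha_bars, rng):
    """One reverse step from x_t to x_{t-1}, given the predicted noise."""
    beta_t = betas[t]
    coef = beta_t / math.sqrt(1.0 - alpha_bars[t])
    scale = 1.0 / math.sqrt(1.0 - beta_t)
    # Subtract the scaled noise prediction, then rescale.
    mean = [scale * (x - coef * e) for x, e in zip(x_t, eps_pred)]
    if t == 0:
        return mean  # final step: no fresh noise is added
    sigma = math.sqrt(beta_t)  # assumed variance choice: sigma_t^2 = beta_t
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]

def sample(predict_noise, dim, betas, alpha_bars, rng):
    """Start from Gaussian noise and apply the reverse step for every t."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        x = ddpm_step(x, predict_noise(x, t), t, betas, alpha_bars, rng)
    return x

betas, alpha_bars = make_schedule()
TARGET = [0.5, -1.0, 0.25]  # hypothetical one-point "dataset"

def oracle_noise(x_t, t):
    """Stand-in for a trained network: exact noise if all data equals TARGET."""
    s = math.sqrt(alpha_bars[t])
    n = math.sqrt(1.0 - alpha_bars[t])
    return [(x - s * c) / n for x, c in zip(x_t, TARGET)]

result = sample(oracle_noise, len(TARGET), betas, alpha_bars, random.Random(0))
```

Because the oracle knows the only possible clean sample, result matches TARGET up to floating-point error. With a real trained model in place of oracle_noise, the same loop would draw varied samples from the learned data distribution.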