RUNLOCALAI · v38
Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Generative AI

Latent Diffusion

Latent diffusion is a technique used in image generation models such as Stable Diffusion that runs the diffusion process in a compressed, lower-dimensional latent space rather than directly in pixel space. During training, a pretrained autoencoder encodes each image into a latent representation, noise is added to that latent, and the model learns to remove it. During generation, the model starts from random noise in latent space and denoises it step by step. Working in latent space substantially reduces compute and memory requirements, making it feasible to run on consumer GPUs. Operators encounter latent diffusion in tools like Stable Diffusion WebUI or ComfyUI, where the VAE (Variational Autoencoder) handles the encode/decode steps and the UNet denoises the latent.

Deeper dive

Standard diffusion models operate directly on high-resolution pixel grids (e.g., 512x512x3 = ~786k dimensions), which is computationally expensive. Latent diffusion compresses the image into a smaller latent space (e.g., 64x64x4 = ~16k dimensions) using a pretrained VAE encoder. The diffusion process (forward noise addition and reverse denoising) then runs in this latent space, so the denoiser handles 48x fewer values in this example, with correspondingly large savings in memory and compute. After denoising, the VAE decoder reconstructs the final image. This design is why Stable Diffusion can run on GPUs with as little as 4 GB VRAM (at reduced resolution or with optimizations). The trade-off is that the VAE introduces slight compression artifacts, and the structure of the latent space can affect output quality. Stable Diffusion XL targets a higher native resolution (1024x1024), so it works with spatially larger latents (128x128x4 at the same 8x downsampling) for finer detail.
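The dimension counts above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the 8x spatial downsampling and 4 latent channels implied by the 512x512 to 64x64x4 example; the function names are illustrative and not part of any tool's API.

```python
def latent_shape(height, width, downsample=8, channels=4):
    """Shape of the latent for a given image size (assumed 8x downsample, 4 channels)."""
    if height % downsample or width % downsample:
        raise ValueError("image size must be a multiple of the downsample factor")
    return (height // downsample, width // downsample, channels)


def num_values(shape):
    """Total number of scalar values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


def compression_ratio(height, width, pixel_channels=3):
    """How many times fewer values the latent holds than the RGB image."""
    pixels = num_values((height, width, pixel_channels))
    latents = num_values(latent_shape(height, width))
    return pixels / latents


print(latent_shape(512, 512))       # (64, 64, 4)
print(compression_ratio(512, 512))  # 48.0
print(latent_shape(1024, 1024))     # (128, 128, 4)
```

Under these assumptions the ratio is 48x at any resolution, because both the image and the latent scale with height times width.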

Practical example

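A 512x512 RGB image has 786,432 pixel values. After VAE encoding, the latent representation is 64x64x4 = 16,384 values, a 48x reduction. The UNet denoising step therefore operates on ~16k values instead of ~786k, which lets a 512x512 generation fit into ~4 GB VRAM. On an RTX 3060 12 GB, generating a 512x512 image with Stable Diffusion 1.5 takes ~2-3 seconds, depending on step count and sampler. Without latent compression, a comparable denoiser would process 48x more values per step, and the cost of its attention layers grows faster than linearly with that count; as a rough estimate rather than a measurement, the same task would require >48 GB VRAM and be impractical on consumer hardware.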
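To get a feel for why pixel-space denoising would be so much heavier, the sketch below estimates the size of one self-attention score matrix if attention were applied at full spatial resolution. This is a back-of-the-envelope illustration under loud assumptions: one token per spatial position, fp16 scores (2 bytes), a single head and a single image, and a naive implementation that materializes the whole matrix. Real implementations often avoid materializing it, so treat the numbers as an intuition aid, not a VRAM measurement.

```python
def attention_matrix_bytes(height, width, bytes_per_value=2):
    """Bytes for one naive (tokens x tokens) attention score matrix.

    Assumes one token per spatial position and fp16 scores by default.
    """
    tokens = height * width
    return tokens * tokens * bytes_per_value


latent_bytes = attention_matrix_bytes(64, 64)    # attention over a 64x64 latent grid
pixel_bytes = attention_matrix_bytes(512, 512)   # hypothetical attention over 512x512 pixels

print(latent_bytes / 2**20)        # 32.0 (MiB)
print(pixel_bytes / 2**30)         # 128.0 (GiB)
print(pixel_bytes // latent_bytes) # 4096
```

The spatial grid shrinks by 64x (8x per side), but the score matrix shrinks by 64 squared = 4096x, which is one reason the savings from working in latent space can exceed the raw 48x reduction in values.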

Workflow example

In Stable Diffusion WebUI, the operator selects a checkpoint (e.g., v1-5-pruned-emaonly.safetensors) and sets the VAE to 'automatic' or to a specific file such as vae-ft-mse-840000-ema-pruned. For text-to-image generation, the workflow is: (1) random noise is sampled directly in latent space (the VAE encoder is only needed when starting from an existing image, as in img2img or inpainting), (2) the UNet denoises the latent over ~20-50 steps (controlled by sampler settings), (3) the VAE decoder reconstructs the final image. The operator sees a VAE loading message in the console and can monitor VRAM usage. Latent diffusion keeps VRAM under ~6 GB for 512x512, enabling batch generation on mid-range GPUs.
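The text-to-image stages can be sketched as a toy pipeline that tracks only control flow and tensor shapes. Everything here is a hypothetical stand-in: `toy_denoise_step` replaces the UNet with a simple shrink toward zero, `toy_decode` replaces the VAE decoder with nearest-neighbour upsampling, and there is no text conditioning. It is not how Stable Diffusion WebUI is implemented; it only shows that generation starts from latent noise and that decoding happens once, at the end.

```python
import random

LATENT_CHANNELS = 4  # assumed, matching the 64x64x4 example
DOWNSAMPLE = 8       # assumed 8x spatial compression


def sample_initial_latent(height, width, seed):
    """Stage 1: draw Gaussian noise directly in latent space (no VAE encoder)."""
    rng = random.Random(seed)
    h, w = height // DOWNSAMPLE, width // DOWNSAMPLE
    return [[[rng.gauss(0.0, 1.0) for _ in range(w)] for _ in range(h)]
            for _ in range(LATENT_CHANNELS)]


def toy_denoise_step(latent):
    """Stage 2 placeholder: a real UNet predicts noise; here we just shrink values."""
    return [[[value * 0.9 for value in row] for row in channel] for channel in latent]


def toy_decode(latent):
    """Stage 3 placeholder: upsample the first 3 latent channels 8x as fake RGB."""
    image = []
    for channel in latent[:3]:
        h, w = len(channel), len(channel[0])
        image.append([[channel[y // DOWNSAMPLE][x // DOWNSAMPLE]
                       for x in range(w * DOWNSAMPLE)]
                      for y in range(h * DOWNSAMPLE)])
    return image


def generate(height=512, width=512, steps=20, seed=0):
    latent = sample_initial_latent(height, width, seed)
    for _ in range(steps):  # the denoising loop runs entirely in latent space
        latent = toy_denoise_step(latent)
    return latent, toy_decode(latent)  # decode once, after the last step


latent, image = generate(height=64, width=64, steps=5)
print(len(latent), len(latent[0]), len(latent[0][0]))  # 4 8 8
print(len(image), len(image[0]), len(image[0][0]))     # 3 64 64
```

The small 64x64 size in the usage lines just keeps the pure-Python toy fast; at 512x512 the same shape bookkeeping gives a 4x64x64 latent and a 3x512x512 image.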

Reviewed by Fredoline Eruo. See our editorial policy.
