RUNLOCALAI · v38

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

OP·Fredoline Eruo
Category: Image Generation
Weights: Mixed (open + closed variants)
Licenses: Mixed (SD 1.5 OpenRAIL, SDXL OpenRAIL, SD 3.5 community + commercial)

Stable Diffusion

by Stability AI

The pioneering open-weight image-gen family. SDXL remains widely deployed; SD 3.5 Large is the architectural successor. Massive finetune ecosystem (Pony, Illustrious, NoobAI, dozens of community models).

Best entry point for local use

Start with SDXL 1.0 via ComfyUI on an RTX 3060 12 GB. SDXL is the most mature, best-documented, and most heavily fine-tuned open-weight image-generation model: it generates a 1024×1024 image in ~6 seconds on that card and has >50,000 community LoRAs/checkpoints on CivitAI. For higher quality, SD 3.5 Large uses a DiT architecture (same lineage as Flux) and generates a 1024×1024 image in ~12 seconds on an RTX 4090 24 GB. Skip SD 3 Medium — it shipped with anatomy-quality issues. Skip SD 1.5 — its 512×512 native resolution is obsolete, and SDXL reaches the same final output size with better quality. Licensing: SDXL is OpenRAIL-licensed; SD 3.5 uses the Stability AI Community License — free for non-commercial and small-business use (<$1M annual revenue). For commercial use above $1M, a Stability AI Creator License is required.
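A quick way to sanity-check VRAM fit yourself before picking a card. This is a back-of-envelope sketch, not our benchmark methodology: it assumes weight memory dominates (parameters × bytes per element) plus a rough activation margin, and the parameter counts in the comments are approximate public figures. Real runtimes add workspace on top, so treat the numbers as a floor.

```python
# Back-of-envelope VRAM estimate: weights = params * bytes per element.
# Parameter counts are approximate public figures (assumption), and the
# 20% margin for activations/workspace is a rough guess, not a measurement.

BYTES_PER_ELEM = {"fp16": 2, "q8": 1, "fp32": 4}

def weight_gb(params_billion: float, dtype: str = "fp16") -> float:
    """Gigabytes needed just to hold the weights at the given precision."""
    return params_billion * 1e9 * BYTES_PER_ELEM[dtype] / 1024**3

def fits(total_gb: float, vram_gb: int, margin: float = 1.2) -> bool:
    """True if weights plus a ~20% activation margin fit in VRAM."""
    return total_gb * margin <= vram_gb

# SDXL: ~2.6B UNet + ~0.8B text encoders + ~0.08B VAE (approximate)
sdxl = weight_gb(2.6) + weight_gb(0.8) + weight_gb(0.08)
print(f"SDXL fp16 weights: ~{sdxl:.1f} GB, fits 12 GB card: {fits(sdxl, 12)}")

# SD 3.5 text encoders: CLIP-L ~0.12B + CLIP-G ~0.7B + T5-XXL ~4.8B
enc_fp16 = weight_gb(0.12) + weight_gb(0.7) + weight_gb(4.8)
enc_q8 = weight_gb(0.12) + weight_gb(0.7) + weight_gb(4.8, "q8")
print(f"SD 3.5 encoders fp16: ~{enc_fp16:.1f} GB, with T5 at q8: ~{enc_q8:.1f} GB")
```

The same arithmetic explains the SDXL-on-12 GB verdict: FP16 weights land well under 12 GB even with headroom for activations.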

Deployment guidance

  • Single-user generation: ComfyUI with SDXL 1.0 FP16 on an RTX 3060 12 GB — ~6 sec/image at 1024×1024 with a 20-step DPM++ 2M scheduler. AUTOMATIC1111 WebUI is the alternative with the larger extension ecosystem.
  • SD 3.5 Large: ComfyUI with the SD3 DiT node on an RTX 4090 24 GB — ~12 sec/image at 1024×1024 FP16. The triple-text-encoder stack (CLIP-L + CLIP-G + T5-XXL) needs ~15 GB for the text encoders alone at FP16 — run T5-XXL at Q8 via bitsandbytes to fit in 24 GB.
  • Server/batching: ComfyUI in API mode with its job queue — submit jobs over HTTP and let the queue drain at depth.
  • LoRA training: Kohya SS SDXL LoRA training on an RTX 3060 12 GB — ~8 GB VRAM for a rank-16 LoRA, ~30 min per epoch on 1K images.
  • Distilled variants: SDXL Lightning/LCM/DMD cut steps to 4–8 for ~2 sec/image on an RTX 4090 with a minor quality tradeoff.

See the GPU buyer guide.
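For the server/batching path, ComfyUI's stock API server accepts a workflow graph as JSON on its /prompt endpoint and queues it. The sketch below builds and optionally submits such a job using only the standard library; the endpoint path and payload shape match the stock API server, but the workflow graph here is a stub — export a real one from ComfyUI with "Save (API Format)". The host/port are assumptions for a default local install.

```python
import json
import urllib.request
import uuid

COMFY_URL = "http://127.0.0.1:8188"  # default local ComfyUI server (assumption)

def build_prompt_request(workflow, client_id=None):
    """Wrap an API-format workflow graph in the payload /prompt expects."""
    return {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}

def submit(workflow):
    """POST the job to ComfyUI's queue; the response carries a prompt id."""
    body = json.dumps(build_prompt_request(workflow)).encode()
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Stub graph — a real one comes from ComfyUI's "Save (API Format)" export.
workflow_stub = {"3": {"class_type": "KSampler", "inputs": {"steps": 20}}}
payload = build_prompt_request(workflow_stub, client_id="demo")
print(json.dumps(payload))
# submit(workflow_stub)  # uncomment with a running ComfyUI instance
```

Because the server just drains its queue, a batch driver is nothing more than calling submit() in a loop and polling the history endpoint for finished jobs.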

Recommended runtimes

  • Stable Diffusion WebUI (AUTOMATIC1111)
  • ComfyUI

Related families

Flux

Related — keep moving

Compare hardware
  • RTX 3090 vs RTX 4090 (image gen) →
  • RTX 4090 vs RTX 5090 →
Buyer guides
  • Best GPU for Stable Diffusion →
  • Best GPU for local AI →
  • Best laptop for local AI →
  • Best Mac for local AI →
  • Best used GPU for local AI →
When it doesn't work
  • CUDA out of memory →
  • Ollama running slowly →
  • ROCm not detected →
  • Model keeps crashing →
Before you buy

Verify Stable Diffusion runs on your specific hardware before committing money.

  • Will it run on my hardware? →
  • Custom hardware comparison →
  • GPU recommender (4 questions) →