RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Generative AI

DreamBooth

DreamBooth is a fine-tuning technique that personalizes a text-to-image model (such as Stable Diffusion) so it can generate images of a specific subject (a person, pet, or object) in new contexts. It trains the model on a small set of subject images (typically 3–5), captioned with a rare identifier token plus a class noun (e.g., "sks dog"), and adds a prior-preservation loss so the model neither overfits to the training images nor forgets what the broader class looks like. The result is a custom checkpoint or a LoRA adapter that can be loaded into image-generation software to produce novel scenes featuring the subject. For operators, full fine-tuning needs significant VRAM (12–24 GB) and time (30–60 minutes on a consumer GPU); LoRA-based variants reduce both.

Deeper dive

DreamBooth, introduced by Google Research in 2022, fine-tunes a diffusion model so that a rare identifier token becomes bound to one specific subject. The process involves: (1) collecting 3–5 images of the subject from different angles and against different backgrounds, (2) assigning a rare token (e.g., "sks") as a placeholder for the subject, and (3) fine-tuning the UNet, and optionally the text encoder, on those images with a prior-preservation term that uses the base model's own generated samples of the class to retain general knowledge. The output is a fine-tuned checkpoint (roughly 2 GB for Stable Diffusion 1.5 in half precision) or, more commonly, a LoRA adapter (on the order of 100 MB) that is applied to the base model at inference time or merged into its weights. Operator-relevant options include full DreamBooth (high VRAM, high quality) and DreamBooth + LoRA (lower VRAM, faster training). Textual Inversion is a related but separate technique that leaves the model weights untouched and trains only embedding vectors. Tools like Kohya_ss, EveryDream2, and Hugging Face's diffusers library provide training scripts. On a 24 GB GPU (RTX 3090), full fine-tuning takes ~45 minutes at 512x512 resolution; LoRA training on a 12 GB card takes ~15 minutes. Inference requires loading the custom checkpoint or LoRA into software like Automatic1111, ComfyUI, or InvokeAI.
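The prior-preservation term described above can be sketched as a weighted sum of two reconstruction losses. This is a minimal illustration, not a training script: plain Python lists stand in for noise tensors, and the names dreambooth_loss and prior_weight are hypothetical, chosen here for readability.

```python
# Sketch of the DreamBooth objective: instance loss + weighted prior loss.
# Plain lists stand in for noise tensors; a real trainer would compute these
# values from the diffusion model's noise predictions in a deep-learning framework.


def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(inst_pred, inst_noise, class_pred, class_noise, prior_weight=1.0):
    """Combine the subject (instance) loss with the prior-preservation loss.

    inst_*       : predicted vs. true noise for a subject image ("a photo of sks cat")
    class_*      : predicted vs. true noise for a base-model sample ("a photo of a cat")
    prior_weight : hypothetical name for the weighting factor on the prior term
    """
    instance_loss = mse(inst_pred, inst_noise)
    prior_loss = mse(class_pred, class_noise)
    return instance_loss + prior_weight * prior_loss


if __name__ == "__main__":
    # instance loss = 0.125, prior loss = 0.5
    print(dreambooth_loss([0.5, 0.0], [0.0, 0.0], [1.0, 1.0], [0.0, 1.0]))  # 0.625
```

Setting prior_weight to 0 reduces this to plain fine-tuning on the subject images, which is the case where the model tends to drift away from what the class looks like in general.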

Practical example

An operator wants to generate images of their cat "Mittens" in various styles. They take 5 photos of Mittens from different angles, then use Kohya_ss to train a LoRA with the token "mttns cat" on a 12 GB RTX 3060. Training takes 20 minutes at 512x512, outputting a 144 MB LoRA file. They load this LoRA into Automatic1111 alongside the same base model it was trained on (Stable Diffusion 1.5 in this example; a LoRA trained on one model family will not work with another, such as Stable Diffusion XL), and prompt "a portrait of mttns cat wearing a wizard hat, oil painting". The model generates Mittens in that style, preserving fur color and face shape.
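In Automatic1111, a LoRA is activated by adding a tag of the form <lora:filename:weight> to the prompt. The helper below is a small sketch of assembling such a prompt for the Mittens example; the function name lora_prompt and the file name "mittens" are assumptions made for illustration, not part of any tool's API.

```python
def lora_prompt(scene, trigger, lora_name, weight=1.0):
    """Build an Automatic1111-style prompt that activates a LoRA.

    scene     : the text prompt, which should contain the trigger phrase
    trigger   : the token the LoRA was trained with (e.g. "mttns cat")
    lora_name : LoRA file name without extension (hypothetical: "mittens")
    weight    : LoRA strength; 1.0 applies it at full strength
    """
    if trigger not in scene:
        # Without the trigger phrase the personalized concept may not appear.
        raise ValueError(f"prompt does not contain the trigger {trigger!r}")
    if not lora_name:
        raise ValueError("lora_name must not be empty")
    return f"{scene} <lora:{lora_name}:{weight:g}>"


if __name__ == "__main__":
    print(
        lora_prompt(
            "a portrait of mttns cat wearing a wizard hat, oil painting",
            trigger="mttns cat",
            lora_name="mittens",
            weight=0.8,
        )
    )
```

Lowering the weight below 1.0 is a common way to trade subject fidelity for more stylistic freedom when the output looks too much like the training photos.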

Workflow example

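In a typical workflow, the operator first prepares a dataset of 3–5 subject images, resized to 512x512. They then run a DreamBooth training script such as train_dreambooth.py from the diffusers examples (the class and output directory names below are placeholders):

    accelerate launch train_dreambooth.py \
      --pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 \
      --instance_data_dir=./mittens \
      --instance_prompt="a photo of sks cat" \
      --with_prior_preservation --prior_loss_weight=1.0 \
      --class_data_dir=./class_cats \
      --class_prompt="a photo of a cat" \
      --num_class_images=200 \
      --output_dir=./mittens-dreambooth \
      --resolution=512 --train_batch_size=1 --gradient_accumulation_steps=1 \
      --learning_rate=5e-6 --lr_scheduler=constant --lr_warmup_steps=0 \
      --max_train_steps=800

The class prompt only takes effect together with --with_prior_preservation and a class data directory; the script generates any missing class images with the base model before training starts. After training, the output folder is in diffusers format. It can be loaded directly by diffusers-based tools such as InvokeAI by pointing the model path to the new folder, while Automatic1111 expects a single-file checkpoint, so the folder has to be converted first. Inference prompts use the unique token (e.g., "sks cat") to trigger the personalized concept.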
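The dataset preparation step can be sketched as a center crop followed by a resize. The code below assumes Pillow is installed for the image work; the function names and the 512 default are illustrative choices, and center cropping is only one reasonable option (a subject near the edge of the frame would need a manual crop instead).

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def prepare_image(src_path, dst_path, size=512):
    """Center-crop an image to a square and resize it to size x size.

    Assumes Pillow is installed; it is imported lazily so that
    center_crop_box can be used and tested without it.
    """
    from PIL import Image

    with Image.open(src_path) as img:
        rgb = img.convert("RGB")  # drop alpha or palette modes
        cropped = rgb.crop(center_crop_box(*rgb.size))
        cropped.resize((size, size), Image.LANCZOS).save(dst_path)


if __name__ == "__main__":
    # A landscape 800x600 photo keeps its middle 600x600 region.
    print(center_crop_box(800, 600))  # (100, 0, 700, 600)
```

Running prepare_image over each of the 3–5 source photos yields a folder that can be passed as the instance data directory.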

Reviewed by Fredoline Eruo. See our editorial policy.
