RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts: there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend.

© 2026 runlocalai.co · Independently operated
Generative AI

ControlNet

ControlNet is a neural network architecture that adds spatial conditioning to pretrained image diffusion models such as Stable Diffusion. It takes an additional input image (e.g., a depth map, an edge map, or a pose skeleton) and guides generation to follow that structure. The operator loads a ControlNet alongside the base model; the extra input constrains where content appears. ControlNets are small relative to the base checkpoint (typically 1–2 GB at FP16), so one fits alongside a Stable Diffusion 1.5 or SDXL model (roughly 1B–3.5B parameters) on a 12–24 GB GPU, though VRAM usage increases by roughly 20–30%.
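The 20–30% figure above can be turned into a quick budget check. The following is a minimal sketch, not a measurement tool: the function names are hypothetical, the 10 GB baseline is an illustrative number rather than a benchmark, and real usage depends on resolution, precision, and runtime.

```python
from typing import Tuple


def vram_with_controlnet(baseline_gb: float,
                         overhead_low: float = 0.20,
                         overhead_high: float = 0.30) -> Tuple[float, float]:
    """Return a (low, high) VRAM estimate in GB after adding one ControlNet.

    baseline_gb is the VRAM the base model uses on its own; the overhead
    defaults mirror the rough 20-30% range quoted in the text.
    """
    if baseline_gb <= 0:
        raise ValueError("baseline_gb must be positive")
    return baseline_gb * (1 + overhead_low), baseline_gb * (1 + overhead_high)


def fits(baseline_gb: float, gpu_vram_gb: float) -> bool:
    """Conservative check: does the high-end estimate fit on the card?"""
    _, high = vram_with_controlnet(baseline_gb)
    return high <= gpu_vram_gb


# Illustrative baseline of 10 GB (an assumption, not a measured value).
low, high = vram_with_controlnet(10.0)
print(f"{low:.1f}-{high:.1f} GB")  # prints 12.0-13.0 GB
```

Checking against the high end of the range is a deliberate choice: an estimate that is slightly pessimistic is cheaper than an out-of-memory crash mid-generation.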

Deeper dive

ControlNet works by locking the weights of the pretrained diffusion model and training a separate copy of its UNet encoder blocks as the 'control' network. The copy receives the conditioning image and injects its features back into the locked UNet at multiple resolutions through zero-initialized convolutions, so training starts from the base model's unmodified behavior. During inference, the operator provides a conditioning image (e.g., Canny edges, depth map, OpenPose skeleton) and a prompt. The ControlNet adds its features to the UNet's intermediate activations so the output respects the spatial layout.

Standard variants include Canny (edge-guided), depth (3D structure), normal map, and scribble. Operators often combine multiple ControlNets (e.g., depth + Canny) for finer control, though each adds VRAM overhead. In practice, ControlNet is used in Stable Diffusion workflows via ComfyUI, AUTOMATIC1111, or InvokeAI; the operator selects a preprocessor to generate the conditioning image from a source photo, then runs the combined model.
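The injection mechanism can be illustrated with a toy model. This is a sketch under heavy simplification, not ControlNet's real implementation: `Block`, `ControlBranch`, and `controlled_forward` are hypothetical names, vectors stand in for feature maps, and an elementwise multiply stands in for the zero-initialized convolution described in the ControlNet paper. It shows two things: a freshly initialized control branch leaves the locked model's output unchanged, and several ControlNets combine by adding their strength-scaled residuals.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


class Block:
    """Toy stand-in for one UNet block: an elementwise affine map."""

    def __init__(self, scale: float, bias: float) -> None:
        self.scale = scale
        self.bias = bias

    def __call__(self, x: Vector) -> Vector:
        return [self.scale * v + self.bias for v in x]


class ControlBranch:
    """Trainable copy of a block, gated by a zero-initialized projection."""

    def __init__(self, locked: Block, size: int) -> None:
        # The copy starts from the pretrained weights; it is the part that trains.
        self.copy = Block(locked.scale, locked.bias)
        # Zero-initialized, so the branch contributes nothing before training.
        self.zero_proj: Vector = [0.0] * size

    def residual(self, x: Vector, condition: Vector) -> Vector:
        # The copy sees the features plus the conditioning signal.
        hidden = self.copy([a + c for a, c in zip(x, condition)])
        return [w * h for w, h in zip(self.zero_proj, hidden)]


def controlled_forward(
    locked: Block,
    x: Vector,
    controls: Sequence[Tuple[ControlBranch, Vector, float]],
) -> Vector:
    """Run the locked block, then add each control residual scaled by its strength."""
    out = locked(x)
    for branch, condition, strength in controls:
        res = branch.residual(x, condition)
        out = [o + strength * r for o, r in zip(out, res)]
    return out


locked = Block(scale=2.0, bias=1.0)
x = [1.0, 2.0]
depth = ControlBranch(locked, size=2)

# Before training, the control branch is a no-op.
untrained = controlled_forward(locked, x, [(depth, [0.5, -0.5], 0.5)])
print(untrained == locked(x))  # prints True
```

Once `zero_proj` moves away from zero (which training would do), the same call returns an output shifted by the conditioning, and the strength value scales that shift.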

Practical example

An operator wants to generate an image of a castle that matches the layout of a photo. They load the Stable Diffusion XL (SDXL) base checkpoint (6.9 GB) plus a depth ControlNet (1.2 GB) on an RTX 4090 (24 GB). They run the depth preprocessor on the photo to produce a grayscale depth map, then set the ControlNet weight to 0.8. The output preserves the photo's 3D structure while the prompt 'fantasy castle, sunset' sets the subject and style. VRAM usage peaks at ~18 GB for a 1024×1024 image, which fits within the card's 24 GB.
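The "grayscale depth map" step can be sketched in isolation. A real depth preprocessor first runs a depth-estimation network on the photo; the snippet below covers only the final normalization to 0–255 pixel values. It assumes the input is a small 2D grid of relative depth values where larger means nearer (so nearer surfaces come out brighter), and `depth_to_grayscale` is a hypothetical helper name.

```python
from typing import List


def depth_to_grayscale(nearness: List[List[float]]) -> List[List[int]]:
    """Min-max normalize a 2D grid of relative depth values to 0-255 integers."""
    flat = [v for row in nearness for v in row]
    if not flat:
        raise ValueError("empty depth grid")
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat scene carries no structure; map everything to black.
        return [[0 for _ in row] for row in nearness]
    scale = 255.0 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in nearness]


# Illustrative 2x2 grid of relative depth values (made-up numbers).
print(depth_to_grayscale([[0.0, 1.0], [3.0, 4.0]]))  # prints [[0, 64], [191, 255]]
```

Because the normalization is relative to each image's own minimum and maximum, the map encodes scene layout rather than absolute distance, which is what the ControlNet needs to preserve the photo's structure.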

Workflow example

In ComfyUI, the operator loads a checkpoint (e.g., sd_xl_base_1.0.safetensors) and a ControlNet model (e.g., controlnet-depth-sdxl-1.0.safetensors). They add a 'Load Image' node for the source photo, a ControlNet preprocessor node (set to 'Depth MiDaS') to turn the photo into a depth map, and an 'Apply ControlNet' node that takes the prompt conditioning, the ControlNet, and the depth map. They set the strength to 0.9 and queue the prompt. The runtime loads both models into VRAM; the operator monitors memory with nvidia-smi to catch usage approaching the card's limit before it causes an out-of-memory (OOM) error.
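The nvidia-smi check can be scripted rather than watched by hand. This is a minimal sketch, assuming nvidia-smi is on the PATH; the `--query-gpu` and `--format` options are part of the nvidia-smi CLI, while the helper names and the 90% warning threshold are illustrative assumptions.

```python
import subprocess
from typing import List, Tuple

# Ask nvidia-smi for used and total memory per GPU, in MiB, as bare CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> List[Tuple[int, int]]:
    """Parse 'used, total' lines (MiB) into integer pairs, one per GPU."""
    pairs = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def near_oom(used_mib: int, total_mib: int, threshold: float = 0.90) -> bool:
    """True when usage reaches the threshold fraction of total VRAM (assumed 90%)."""
    return used_mib / total_mib >= threshold


def read_gpu_memory() -> List[Tuple[int, int]]:
    """Run nvidia-smi once and return (used, total) MiB per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `read_gpu_memory()` in a loop while a generation is queued, and passing each pair to `near_oom`, gives an early warning before the runtime fails. Keeping the parser separate from the subprocess call makes it possible to test the logic on a machine without a GPU.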

Reviewed by Fredoline Eruo. See our editorial policy.
