Notable models & companies

Stability AI

Stability AI is the company behind the Stable Diffusion family of image generation models, which operators run locally via tools like ComfyUI or Automatic1111. These models are open-weight (most under a non-commercial license) and typically require 4–12 GB of VRAM depending on resolution and model size. Operators encounter Stability AI when downloading Stable Diffusion checkpoints from Hugging Face or CivitAI, or when using the Stable Diffusion backend in local image-generation workflows.

Deeper dive

Stability AI released Stable Diffusion 1.4 in August 2022, followed by SD 1.5, SD 2.x, SDXL, SDXL Turbo, and Stable Diffusion 3. Each iteration improved image quality, prompt adherence, and generation speed. SD 1.5 (1.98B parameters) runs on 4–6 GB VRAM at 512×512; SDXL (2.6B parameters) requires 8–12 GB for 1024×1024. The models use a latent diffusion architecture: a VAE compresses images into a latent space, a U-Net denoises latents conditioned on text embeddings from CLIP, and the VAE decodes the result. Operators often fine-tune these models with LoRA or DreamBooth, or merge checkpoints for custom styles. Stability AI also released StableLM for text generation and Stable Audio for audio, but the image models remain the most widely used in local setups.

Practical example

An operator with an RTX 3060 12 GB can run SDXL at 1024×1024 using ComfyUI, achieving ~2–3 it/s with a standard workflow. Switching to SD 1.5 at 512×512 yields ~8–10 it/s. If VRAM is tight (e.g., 8 GB), SDXL may require --medvram flag in Automatic1111 or using a quantized VAE to avoid out-of-memory errors.

Workflow example

In ComfyUI, an operator loads a Stable Diffusion checkpoint (e.g., sd_xl_base_1.0.safetensors) via the Load Checkpoint node. The workflow connects a CLIP text encoder, a KSampler (steps=20, CFG=7), and a VAE decoder. Running the queue generates a 1024×1024 image in ~30 seconds on a 12 GB card. If the operator switches to SD 1.5, they must change the checkpoint and adjust the latent resolution to 512×512.

Reviewed by Fredoline Eruo. See our editorial policy.

Buyer guides

When it doesn't work