Image Generation
image extension
uncrop

Outpainting

Outpainting extends an image beyond its original borders. It is useful for aspect ratio changes, scene expansion, and panorama creation.

Setup walkthrough

  1. Install ComfyUI via Stability Matrix.
  2. ComfyUI Manager → Install Models → search "flux1-fill-dev" (23 GB), or "stable-diffusion-2-inpainting" (5 GB) as a lighter option.
  3. Load an outpainting workflow (search ComfyUI workflow library for "outpaint"). The workflow:
    • Load Image → Pad Image for Outpainting (adds empty canvas in chosen direction)
    • Connect to Flux Fill / SD Inpaint model
    • Prompt: "extend the image naturally" or specific: "continue the beach scene to the left"
    • Queue
  4. First outpainted image in 10-30 seconds. The model fills the padded region while matching the original's style, lighting, and content.
  5. Use cases: change aspect ratio (square→16:9), expand product photos for banners, create panoramas from single shots, extend backgrounds for design.
  6. For character outpainting: use ControlNet + IP-Adapter alongside the fill model to maintain character consistency in the extended region.
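
To make the padding step in the walkthrough concrete, here is a minimal conceptual sketch of what a "pad for outpainting" stage produces: a larger canvas plus a mask marking the region the model should fill. It works on a tiny grid of numbers in plain Python. The function name pad_for_outpaint and its parameters are hypothetical and are not ComfyUI's actual implementation.

```python
def pad_for_outpaint(pixels, left=0, right=0, top=0, bottom=0, fill=0):
    """Pad a 2D grid of pixel values and return (padded, mask).

    mask is 1 where new content should be generated and 0 where the
    original pixels are kept. Hypothetical helper for illustration only.
    """
    height = len(pixels)
    width = len(pixels[0])
    new_w = left + width + right
    new_h = top + height + bottom

    # Start with an empty canvas that is entirely marked "to be filled".
    padded = [[fill] * new_w for _ in range(new_h)]
    mask = [[1] * new_w for _ in range(new_h)]

    # Copy the original pixels in and clear the mask over them.
    for y in range(height):
        for x in range(width):
            padded[top + y][left + x] = pixels[y][x]
            mask[top + y][left + x] = 0
    return padded, mask


# A 2x2 "image" extended by one column on the left and two on the right.
image = [[7, 8],
         [9, 6]]
padded, mask = pad_for_outpaint(image, left=1, right=2)
for row in mask:
    print(row)
```

The fill model only has to generate pixels where the mask is 1, which is why the original content stays untouched while the new canvas is painted to match it.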

The cheap setup

A used RTX 3060 12 GB ($200-250, see /hardware/rtx-3060-12gb) runs SD 2.0 inpainting in outpainting mode at 10-20 seconds per edge extension. For a 1024×1024 → 1920×1024 panorama (adding 448px to each side), expect 20-40 seconds total. Flux Fill at FP8 (12 GB) takes 25-45 seconds for the same operation but produces higher quality. Pair with Ryzen 5 5600 + 32 GB DDR4 + 1TB NVMe. Total: ~$390-440. Outpainting is slightly heavier than inpainting because the filled area is often larger.
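
The panorama arithmetic above can be checked with a short sketch. It assumes the extra width is split evenly between the left and right edges and that the total time is simply the per-edge time multiplied by the number of edges. Both function names are hypothetical.

```python
def horizontal_padding(src_width, target_width):
    """Split the extra width evenly between the left and right edges."""
    extra = target_width - src_width
    if extra < 0:
        raise ValueError("target width is smaller than the source width")
    left = extra // 2
    right = extra - left  # the right side absorbs any odd pixel
    return left, right


def total_time_range(per_edge_seconds, edges):
    """Scale a (low, high) per-edge time range by the number of edges."""
    low, high = per_edge_seconds
    return low * edges, high * edges


left, right = horizontal_padding(1024, 1920)
print(left, right)                    # 448 448
print(total_time_range((10, 20), 2))  # (20, 40)
```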

The serious setup

A used RTX 3090 24 GB ($700-900, see /hardware/rtx-3090) runs Flux Fill Dev FP16 for outpainting at 10-20 seconds per edge extension. The quality difference vs. SD is significant for complex scenes (architecture, landscapes). For production graphic design workflows (daily outpainting for social media banners, print layouts), the speed is acceptable. Total: ~$1,800-2,200. An RTX 4090 ($1,600) drops outpainting to 4-8 seconds, which is viable for interactive "expand canvas" in photo editors.

Common beginner mistake

The mistake: Outpainting without a text prompt, relying entirely on "context-aware fill," then getting blurry or nonsensical extended regions.

Why it fails: Without a prompt, the model extends the image based purely on visual patterns, so it tends to repeat nearby textures or generate generic blur. It has no information about what should be there.

The fix: Always provide a prompt describing what belongs in the extended region. "Extend the forest scene to the left, continuing the path and adding more pine trees" works far better than an empty prompt. For better results, outpaint in smaller increments (100-200px at a time) rather than trying to fill 1000px in one go. Each small extension gives the model the previous extension as context, building a coherent scene iteratively.
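
The incremental approach can be planned ahead of time. The sketch below splits a total extension into roughly equal passes that never exceed a chosen step size. It assumes a 200px cap, taken from the upper end of the range above. The function name split_extension is hypothetical.

```python
import math


def split_extension(total_px, max_step=200):
    """Split a total extension into near-equal passes of at most max_step px."""
    if total_px <= 0:
        return []
    steps = math.ceil(total_px / max_step)
    base, remainder = divmod(total_px, steps)
    # Hand out the leftover pixels one at a time to the first passes.
    return [base + 1 if i < remainder else base for i in range(steps)]


print(split_extension(448))   # [150, 149, 149]
print(split_extension(1000))  # [200, 200, 200, 200, 200]
```

Each value is the padding for one outpainting pass on that edge. The output of one pass becomes the input image for the next.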

Reality check

Image gen is compute-bound, not bandwidth-bound. VRAM matters for the resolution + LoRA training stack, but FP16 TFLOPS is what decides Flux throughput. The 5080's compute advantage over the 5070 Ti shows here in ways it doesn't on LLM inference.

Common mistakes

  • Buying for VRAM ceiling without checking compute (Flux Dev FP16 doesn't fit in 16 GB anyway)
  • Skipping LoRA training requirements (24 GB minimum, 32 GB comfortable for Flux)
  • Underestimating ComfyUI's multi-model VRAM appetite vs. A1111's single-pipeline design
  • Using Q4 quantized image models — quality drop is more visible than on LLMs
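
The VRAM points above follow from a rough weights-only estimate: parameter count times bytes per parameter. The sketch below assumes Flux Dev has about 12 billion parameters (an assumption, though it is consistent with the 12 GB FP8 figure quoted earlier). It ignores text encoders, the VAE, and activation memory, so real usage is higher. The names here are hypothetical.

```python
# Bytes stored per parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "q4": 0.5}


def weight_gb(params_billion, precision):
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits_in_vram(params_billion, precision, vram_gb):
    """True if the weights alone fit. Real workloads also need headroom."""
    return weight_gb(params_billion, precision) <= vram_gb


FLUX_PARAMS_BILLION = 12  # assumed parameter count, not taken from the text above

print(weight_gb(FLUX_PARAMS_BILLION, "fp16"))         # 24.0
print(fits_in_vram(FLUX_PARAMS_BILLION, "fp16", 16))  # False
print(fits_in_vram(FLUX_PARAMS_BILLION, "fp8", 16))   # True
```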

Before you buy

Verify your specific hardware can handle outpainting before committing money.
