RUNLOCALAI · v38
© 2026 runlocalai.co · Independently operated
HOW-TO · INF

How to handle high-resolution images with LLaVA

Intermediate · 15 min · By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES

LLaVA installed in Ollama, one or more high-resolution test images, and optionally an image processing tool (ImageMagick or Python PIL) for the resize and tiling steps.

What this does

Provides strategies for processing images that exceed LLaVA's default resolution handling: resizing, tiling, and context window adjustment. After working through this guide, you should be able to analyze high-resolution images without running into memory errors.

Steps

  1. Check the current image dimensions. Inspect the pixel size before deciding on an approach.

    identify /path/to/highres.jpg
    

    Requires ImageMagick. Alternatively, use python3 -c "from PIL import Image; print(Image.open('/path/to/highres.jpg').size)".

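  2. Resize large images to a manageable dimension. A longest edge of 1024px is a reasonable starting point for LLaVA.

    convert /path/to/highres.jpg -resize '1024x1024>' /tmp/resized.jpg
    

    The geometry keeps the aspect ratio, and the trailing > shrinks only images larger than 1024px, so smaller images are never upscaled. Resizing cuts the memory footprint substantially while preserving most visual information; very fine detail such as small text may be lost.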

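  3. Increase the context window for larger images. This leaves more tokens for the image encoding alongside the prompt.

    printf 'FROM llava\nPARAMETER num_ctx 4096\n' > /tmp/Modelfile && ollama create llava-4k -f /tmp/Modelfile
    ollama run llava-4k "Describe this image in detail. /tmp/resized.jpg"
    

    ollama run has no --num-ctx flag. Set num_ctx with a PARAMETER line in a Modelfile (as here, where llava-4k is an arbitrary name for the derived model), with /set parameter num_ctx 4096 inside an interactive session, or through the options field of the API. A larger context window prevents truncation of the image encoding tokens.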

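  4. Use image tiling for extremely large images. Split the image into quadrants and run the model on each tile separately. The offsets below assume a source of roughly 4096x4096 pixels; adjust them to the dimensions reported in step 1.

    convert /path/to/highres.jpg -crop 2048x2048+0+0 /tmp/tile_1.jpg
    convert /path/to/highres.jpg -crop 2048x2048+2048+0 /tmp/tile_2.jpg
    convert /path/to/highres.jpg -crop 2048x2048+0+2048 /tmp/tile_3.jpg
    convert /path/to/highres.jpg -crop 2048x2048+2048+2048 /tmp/tile_4.jpg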
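
The fixed offsets in step 4 fit only one image size. The following is a minimal sketch, not part of the original guide, that computes crop geometries for any width and height, with an optional overlap so that content at a seam lands fully inside at least one tile. The function name tile_geometries and its defaults are illustrative assumptions; the strings it returns use ImageMagick's WxH+X+Y geometry form.

```python
def tile_geometries(width, height, tile=2048, overlap=0):
    """Return ImageMagick crop geometries (WxH+X+Y) covering the whole image.

    Neighbouring tiles share `overlap` pixels; edge tiles are clipped to the image.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than tile")
    step = tile - overlap

    def starts(size):
        # Offsets along one axis; stop once a tile reaches the far edge.
        out, pos = [], 0
        while True:
            out.append(pos)
            if pos + tile >= size:
                return out
            pos += step

    geoms = []
    for y in starts(height):
        for x in starts(width):
            w = min(tile, width - x)
            h = min(tile, height - y)
            geoms.append(f"{w}x{h}+{x}+{y}")
    return geoms


if __name__ == "__main__":
    # Print one crop command per tile for a 4096x4096 source (paths are placeholders).
    for i, g in enumerate(tile_geometries(4096, 4096), start=1):
        print(f"convert /path/to/highres.jpg -crop {g} /tmp/tile_{i}.jpg")
```

With overlap=0 and a 4096x4096 source this reproduces the four quadrants; a non-zero overlap (for example 256) produces more, partially overlapping tiles.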
    

Verification

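identify /tmp/resized.jpg && ollama run llava "List three details visible in this image. /tmp/resized.jpg"
# Expected: dimensions smaller than the original, followed by a response referencing specific visual details
# Note: ollama run has no --num-ctx flag; to verify with a 4096-token context, run a model whose Modelfile sets PARAMETER num_ctx 4096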
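
The same check can be scripted against Ollama's HTTP API, where the context size is passed per request in the options field. This is a hedged sketch that assumes a local server on the default address http://localhost:11434 and the /api/generate endpoint. The helper names build_request and describe are hypothetical, and describe only works while the Ollama server is running.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumption: default local endpoint


def build_request(image_bytes, prompt, model="llava", num_ctx=4096):
    """Build the JSON body for /api/generate with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "options": {"num_ctx": num_ctx},
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe(path, prompt="List three details visible in this image."):
    """Send one image to a running Ollama server and return the model's text."""
    with open(path, "rb") as f:
        body = json.dumps(build_request(f.read(), prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling describe("/tmp/resized.jpg") should return text that mentions concrete details of the image; an empty or generic answer suggests the image was not received or was truncated.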

Common failures

  • num_ctx too small - The image encoding plus the prompt exceeds the context window; increase num_ctx to 4096 or higher.
  • resize artifacts - Aggressive downscaling loses detail; test several longest-edge sizes (768, 1024, 1280) and compare the answers.
  • tiling misalignment - Tiles may split content at the seams; use overlapping crops for critical analyses.
  • OOM on tile processing - Process tiles sequentially, one ollama call at a time, rather than all at once.
  • convert command not found - Install ImageMagick (sudo apt install imagemagick on Ubuntu) or use Python PIL instead. On ImageMagick 7 the command is magick rather than convert.
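
For the resize artifacts case, it can help to know in advance what each candidate size produces. A minimal sketch, assuming the three longest-edge values listed above: fit_within and resize_sweep are illustrative names, and the sizes they compute may differ from ImageMagick's own rounding by a pixel.

```python
def fit_within(width, height, max_edge):
    """Scale (width, height) so the longest edge is at most max_edge; never upscale."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height
    scale = max_edge / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_sweep(width, height, edges=(768, 1024, 1280)):
    """Map each candidate longest edge to the resulting (width, height)."""
    return {edge: fit_within(width, height, edge) for edge in edges}


if __name__ == "__main__":
    # Example: a 4000x3000 source and the resize command for each candidate edge.
    for edge, (w, h) in resize_sweep(4000, 3000).items():
        print(f"{edge}: {w}x{h}  convert /path/to/highres.jpg -resize '{edge}x{edge}>' /tmp/resized_{edge}.jpg")
```

Running the same prompt against each output and comparing the answers shows where detail starts to drop out.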

Operator checkpoint

Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.
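
One way to keep that record consistent is to store it as JSON next to the test images. The sketch below is only a suggestion: the field names and the checkpoint_record helper are assumptions rather than a format the site prescribes, and the example values are placeholders to replace with your own.

```python
import json
import platform
from datetime import datetime, timezone


def checkpoint_record(runtime, model, hardware, verification_output):
    """Bundle the operator checkpoint fields into one JSON-serialisable dict."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "os": platform.platform(),
        "runtime": runtime,
        "model": model,
        "hardware": hardware,
        "verification_output": verification_output,
    }


if __name__ == "__main__":
    record = checkpoint_record(
        runtime="Ollama 0.4.x",
        model="llava",
        hardware="fill in your GPU/CPU and backend",  # placeholder
        verification_output="paste the output of the verification command",  # placeholder
    )
    print(json.dumps(record, indent=2))
```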

Related guides

  • How to run LLaVA vision model to analyze images
  • How to compare vision model outputs across different model sizes