RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
HOW-TO · INF

How to process multiple images with a single vision model request

intermediate·15 min·By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES

LLaVA or a similar vision-capable model installed in Ollama; at least two test images; curl and jq installed

What this does

Passes multiple image files to a vision model in a single request via the Ollama API, enabling multi-image comparison or scene analysis without separate calls. By the end of this guide, you can send several images to the model with one command.

Steps

  1. Encode multiple images to base64. Converts each image independently.

    IMG1=$(base64 -w0 /path/to/image1.jpg)
    IMG2=$(base64 -w0 /path/to/image2.jpg)
    

    The -w0 flag disables line wrapping. A raw newline inside a JSON string makes the payload invalid, so the encoded image must stay on one line.

  2. Send a multi-image request via the API. Passes both base64 strings in the images array.

    curl -X POST http://localhost:11434/api/generate -d "{\"model\":\"llava\",\"prompt\":\"Compare these two images. What do they have in common?\",\"images\":[\"$IMG1\",\"$IMG2\"]}"
    

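    The model receives both images in the same request and can reference them in a single response. Because this request does not set "stream": false, Ollama returns the reply as a series of JSON objects, one per line.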

  3. Ask a comparison question. Tests whether the model distinguishes between the images.

    curl -s -X POST http://localhost:11434/api/generate -d "{\"model\":\"llava\",\"prompt\":\"Which image contains more people?\",\"stream\":false,\"images\":[\"$IMG1\",\"$IMG2\"]}" | jq -r '.response'
    

    Expected output: a response that refers to each image separately and answers the question.

  4. Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.
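The encoding and request steps above can be folded into one helper that writes the request body to stdout, so it can be piped to curl instead of being inlined on the command line. This is a minimal sketch, not part of the original guide: the function names json_escape and build_payload are hypothetical, the escaping covers only backslashes and double quotes in a single-line prompt, and base64 -w0 assumes GNU coreutils as on the target Ubuntu system.

```shell
# json_escape STRING: escape backslashes and double quotes in a single-line
# string so it can sit inside a JSON string literal. (Hypothetical helper;
# it does not handle newlines or other control characters.)
json_escape() (
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
)

# build_payload MODEL PROMPT FILE...: print an /api/generate request body
# with one base64 string per image file, in argument order.
build_payload() (
  model=$1
  prompt=$2
  shift 2
  printf '{"model":"%s","prompt":"%s","stream":false,"images":[' \
    "$(json_escape "$model")" "$(json_escape "$prompt")"
  sep=''
  for f in "$@"; do
    printf '%s"' "$sep"
    base64 -w0 "$f" || exit 1   # stop if a file cannot be read
    printf '"'
    sep=','
  done
  printf ']}\n'
)
```

A possible invocation, assuming Ollama is listening on the default port: build_payload llava "Compare these two images." image1.jpg image2.jpg | curl -s http://localhost:11434/api/generate --data-binary @- | jq -r '.response'. Feeding the body over stdin keeps large images out of the command-line arguments, which Linux caps at 128 KiB per argument.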

Verification

curl -s -X POST http://localhost:11434/api/generate -d "{\"model\":\"llava\",\"prompt\":\"List objects visible in both images.\",\"stream\":false,\"images\":[\"$IMG1\",\"$IMG2\"]}" | jq -r '.response'
# Expected: Text referencing content from both provided images
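By default /api/generate streams its reply as newline-delimited JSON chunks, each carrying a fragment of text in its response field. If a request leaves streaming on, the fragments can be joined back into one reply. The sketch below assumes jq is installed; the function name join_stream is hypothetical.

```shell
# join_stream: read streamed /api/generate chunks on stdin and print the
# concatenated reply. jq -j writes each string raw, with no newline between
# chunks; chunks without a response field are skipped.
join_stream() {
  jq -j '.response // empty'
  echo   # finish with a single trailing newline
}
```

It would sit at the end of a streaming request in place of jq '.response', for example: curl -s http://localhost:11434/api/generate -d @body.json | join_stream, where body.json is a placeholder for a saved request body.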

Common failures

  • images field not an array - Payload must use an array even for a single image; for multiple images, add more elements.
  • mismatched image order - Model processes images in array order; swap elements if behavior seems reversed.
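  • payload too large - Combined base64 for large images causes timeouts, and a payload inlined with -d can exceed the shell's argument-size limit ("Argument list too long"); compress or resize images before encoding.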
  • missing -w0 in base64 - Newlines in base64 break JSON parsing; always include the flag.
  • connection refused - Ollama daemon not running; confirm with ps aux | grep ollama.
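For the payload-size failure, it can help to estimate how large the encoded images will be before sending anything. Base64 emits 4 output bytes for every 3 input bytes, rounded up. The sketch below applies that arithmetic to a list of files. The function names b64_size and payload_size are hypothetical, and no particular size threshold is implied; what counts as too large depends on the model and hardware.

```shell
# b64_size FILE: number of bytes FILE occupies once base64-encoded
# (4 output bytes per 3 input bytes, rounded up, no line wrapping).
b64_size() (
  n=$(wc -c < "$1") || exit 1
  echo $(( (n + 2) / 3 * 4 ))
)

# payload_size FILE...: total encoded size of all images in one request.
payload_size() (
  total=0
  for f in "$@"; do
    s=$(b64_size "$f") || exit 1
    total=$(( total + s ))
  done
  echo "$total"
)
```

Running payload_size image1.jpg image2.jpg prints the combined size in bytes. A result above 131072 (128 KiB) for a single image indicates that inlining it with curl -d "..." would hit the Linux per-argument limit.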

Related guides

  • How to format image prompts correctly for LLaVA models
  • How to run LLaVA vision model to analyze images