How to process multiple images with a single vision model request
Prerequisites: LLaVA or a similar vision-capable model installed in Ollama, curl and jq available on the command line, and at least two test images.
What this does
Passes multiple image files to a vision model in a single request via the Ollama API, enabling multi-image comparison or scene analysis without separate calls. After completing this guide, you can submit several images for analysis with a single command.
Steps
- Encode multiple images to base64. Each image is converted independently.
    IMG1=$(base64 -w0 /path/to/image1.jpg)
    IMG2=$(base64 -w0 /path/to/image2.jpg)
  Using -w0 prevents line wrapping, which is required for valid JSON.
- Send a multi-image request via the API. Passes both base64 strings in the images array.
    curl -X POST http://localhost:11434/api/generate -d "{\"model\":\"llava\",\"prompt\":\"Compare these two images. What do they have in common?\",\"images\":[\"$IMG1\",\"$IMG2\"]}"
  The model receives both images simultaneously and can reference them in a single response.
- Ask a comparison question. Tests whether the model distinguishes between the images. Setting "stream" to false makes the API return one JSON object, so jq prints the whole answer instead of one fragment per streamed chunk.
    curl -s -X POST http://localhost:11434/api/generate -d "{\"model\":\"llava\",\"stream\":false,\"prompt\":\"Which image contains more people?\",\"images\":[\"$IMG1\",\"$IMG2\"]}" | jq '.response'
  Expected output: A response that correctly identifies each image and answers the question.
- Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.
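The steps above can be sketched as one reusable helper. This is a minimal sketch under stated assumptions, not part of Ollama: build_payload and payload.json are hypothetical names, the prompt is assumed to contain no double quotes or backslashes (no JSON escaping is done), and piping through tr -d '\n' stands in for -w0 so the same line also works where base64 lacks that flag. Writing the body to a file and sending it with -d @payload.json keeps the base64 data off the command line, which may matter because very long inline arguments can fail with "Argument list too long".

```shell
#!/usr/bin/env bash
# build_payload MODEL PROMPT IMAGE...
# Prints a JSON body for Ollama's /api/generate with every IMAGE base64-encoded
# into the "images" array, in the order given.
# Assumption: PROMPT has no double quotes or backslashes (no escaping is done).
build_payload() {
  local model=$1 prompt=$2 img sep=""
  shift 2
  printf '{"model":"%s","prompt":"%s","stream":false,"images":[' "$model" "$prompt"
  for img in "$@"; do
    printf '%s"' "$sep"
    # tr removes any line wrapping, giving the same result as base64 -w0
    base64 < "$img" | tr -d '\n'
    printf '"'
    sep=","
  done
  printf ']}\n'
}

# Usage (requires a running Ollama daemon, so left commented out here):
#   build_payload llava "Compare these two images." image1.jpg image2.jpg > payload.json
#   curl -s http://localhost:11434/api/generate -d @payload.json | jq -r '.response'
```

Adding a third or fourth image is then a matter of appending another path to the build_payload call.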
Verification
curl -s -X POST http://localhost:11434/api/generate -d "{\"model\":\"llava\",\"stream\":false,\"prompt\":\"List objects visible in both images.\",\"images\":[\"$IMG1\",\"$IMG2\"]}" | jq '.response'
# Expected: Text referencing content from both provided images
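The verification can also be checked mechanically rather than by eye. The sketch below assumes the reply was saved to a file (response.json is a hypothetical name) from a request made with "stream": false, and only confirms that the reply is finished and contains non-empty text; whether the text really references both images still needs a human read. response_ok is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# response_ok FILE
# Succeeds if FILE holds a completed /api/generate reply with non-empty text.
# jq -e turns a false or null result into a non-zero exit status.
response_ok() {
  jq -e '.done == true and (.response | type == "string" and length > 0)' "$1" > /dev/null
}

# Usage (after saving a reply, for example with curl ... > response.json):
#   response_ok response.json && echo "reply looks complete" || echo "reply missing or empty"
```

An error reply such as one for an unknown model has no "done" field, so the check fails instead of printing null.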
Common failures
- images field not an array - Payload must use an array even for a single image; for multiple images, add more elements.
- mismatched image order - Model processes images in array order; swap elements if behavior seems reversed.
- payload too large - Combined base64 for large images causes timeouts; compress images before encoding.
- missing -w0 in base64 - Newlines in base64 break JSON parsing; always include the flag.
- connection refused - Ollama daemon not running; confirm with ps aux | grep ollama.
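Two of the failures above can be caught before sending anything. The following is a rough pre-flight sketch: image_small_enough, ollama_reachable, and the MAX_IMAGE_BYTES threshold are hypothetical names and values chosen for illustration, not Ollama settings. The reachability check queries /api/tags, the endpoint that lists local models, on the default port used throughout this guide.

```shell
#!/usr/bin/env bash
# Hypothetical size limit for this sketch; tune it to what your setup handles.
MAX_IMAGE_BYTES=${MAX_IMAGE_BYTES:-2000000}

# image_small_enough FILE
# Succeeds if FILE exists and is no larger than MAX_IMAGE_BYTES.
image_small_enough() {
  [ -f "$1" ] || return 1
  local size
  size=$(($(wc -c < "$1")))
  [ "$size" -le "$MAX_IMAGE_BYTES" ]
}

# ollama_reachable
# Succeeds if the daemon answers on the default port within 5 seconds.
ollama_reachable() {
  curl -sf --max-time 5 http://localhost:11434/api/tags > /dev/null
}

# Usage:
#   ollama_reachable || echo "Ollama is not answering on localhost:11434"
#   image_small_enough /path/to/image1.jpg || echo "compress image1.jpg before encoding"
```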