Ollama: model requires more system memory than is available
Cause
This is system RAM, not VRAM, so it is a different problem from a GPU out-of-memory error. Ollama maps or copies the model file into system RAM before transferring layers to VRAM, so on a machine with little free RAM and a large model, the load step fails before the GPU is even involved.
Solution
Check current system RAM usage:
# Linux
free -h
# macOS
vm_stat   # or: memory_pressure
# Windows PowerShell
Get-Counter '\Memory\Available MBytes'
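To gauge whether a model will fit, compare the available figure against the model's download size. ollama list reports the on-disk size of each pulled model, which is roughly the RAM needed to map it (budget a few extra GB for the context/KV cache):
# on-disk size per model is close to the RAM needed to load it
ollama list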
Free system RAM by closing other apps (browsers, IDEs, Slack — these can chew 4-8 GB easily).
Use a smaller quantization so the file is smaller on disk and in RAM:
# Q4 instead of Q8: roughly half the file size
ollama pull llama3.1:8b-instruct-q4_K_M
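As a rough rule of thumb, file size ≈ parameter count × bits per weight ÷ 8. An 8B model at ~4.5 bits per weight (q4_K_M) works out to roughly 4.5-5 GB, versus about 8.5 GB at Q8, so stepping down one quantization level is often the difference between loading and failing.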
Add swap (Linux):
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
Loading is slower but possible.
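To keep the swap file across reboots (this assumes a standard /etc/fstab setup; skip it if your distro manages swap another way), add an fstab entry and confirm the swap is active:
# persist the swap file across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
# verify
swapon --show
free -h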
Get more RAM. For local AI work, 32 GB RAM is the practical floor; 64 GB or 128 GB unlocks larger models with CPU offload.
Did this fix it?
If your case was different, email hello@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.