
Ollama model not found — fix the pull / registry / namespace error

Ollama 'model not found' errors usually trace to a typo in the model name, a model that doesn't exist in the official registry, a network block on the registry, or a custom registry that needs auth.

Ollama · Ollama Hub registry · custom Ollama registries
By Fredoline Eruo · Last verified 2026-05-08

Diagnostic order — most likely first

#1

Model name typo or wrong tag

Diagnose

`ollama pull llama3.1` works but `ollama pull llama3.1:70b-q4_K_M` fails with 'manifest unknown.' Tag doesn't exist on the registry.

Fix

Browse the official model page on ollama.com to find valid tags. Common pattern: `<model>:<size>` or `<model>:<size>-<quant>`. Example: `ollama pull llama3.1:70b` (uses default quant) or `ollama pull llama3.1:70b-q4_K_M`.
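If you'd rather script the tag check, the registry exposes a Docker-style tags endpoint. This is an unofficial, undocumented interface that may change without notice; `llama3.1` below is just an example:

```
# List available tags for a library model (unofficial OCI-style endpoint)
curl -s https://registry.ollama.ai/v2/library/llama3.1/tags/list | jq -r '.tags[]'

# Probe one tag before committing to a long pull; HTTP 200 = manifest exists
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
  https://registry.ollama.ai/v2/library/llama3.1/manifests/70b-q4_K_M
```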

#2

Model genuinely doesn't exist in Ollama's registry

Diagnose

Some HuggingFace models aren't mirrored to ollama.com. `ollama pull <random-hf-model>` fails because Ollama only pulls from its registry by default.

Fix

Either find a similar model on ollama.com, or download the GGUF and import it via a Modelfile: `FROM /path/to/model.gguf`, then `ollama create my-model -f Modelfile`. Recent Ollama builds can also pull GGUF repos straight from HuggingFace with `ollama run hf.co/<user>/<repo>`.
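When the direct hf.co pull isn't available, a minimal end-to-end import looks like this. The repo and file names are placeholders; `huggingface-cli` ships with the `huggingface_hub` package:

```
# Download a GGUF from HuggingFace (repo/filename are hypothetical)
pip install -U huggingface_hub
huggingface-cli download bartowski/SomeModel-GGUF some-model-q4_k_m.gguf --local-dir .

# Register the local file with Ollama via a one-line Modelfile
printf 'FROM ./some-model-q4_k_m.gguf\n' > Modelfile
ollama create my-model -f Modelfile
ollama run my-model
```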

#3

Network blocks on Ollama's registry

Diagnose

`curl https://registry.ollama.ai` fails or times out. Corporate firewall, VPN, or country block.

Fix

Test: `curl -v https://registry.ollama.ai/v2/`. If blocked, configure proxy: `export HTTPS_PROXY=http://proxy.host:port` before `ollama pull`. Or download GGUF manually + import via Modelfile.
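Keep in mind that pulls are executed by the Ollama server process, not the CLI, so the proxy variable has to be visible to `ollama serve`. A sketch, assuming a proxy at `proxy.host:port`:

```
# Reachability test; verbose output shows where the connection dies
curl -v https://registry.ollama.ai/v2/

# If you run the server by hand, export the proxy in that shell first
export HTTPS_PROXY=http://proxy.host:port
ollama serve

# On systemd installs, set it on the service instead:
#   sudo systemctl edit ollama
# then add under [Service]:
#   Environment="HTTPS_PROXY=http://proxy.host:port"
# and restart: sudo systemctl restart ollama
```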

#4

Custom Ollama registry without auth

Diagnose

Trying `ollama pull mycorp.com/internal-model` fails because credentials aren't configured.

Fix

Ollama's auth support for third-party registries is limited. If the 'registry' is actually an internal Ollama server, point the client at it with OLLAMA_HOST. Otherwise, download the GGUF through your normal authenticated channel and import it via Modelfile; that path sidesteps registry auth entirely and is more reliable.
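A sketch of both paths; the hostnames and token below are placeholders for whatever your environment actually uses:

```
# Path 1: the "registry" is really an internal Ollama server
OLLAMA_HOST=https://ollama.mycorp.com ollama pull internal-model

# Path 2: fetch the GGUF with your normal authenticated tooling,
# then import locally (sidesteps registry auth entirely)
curl -fSL -u "user:$TOKEN" -o internal-model.gguf \
  https://artifacts.mycorp.com/models/internal-model.gguf
printf 'FROM ./internal-model.gguf\n' > Modelfile
ollama create internal-model -f Modelfile
```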

#5

Disk full preventing the pull from completing

Diagnose

Pull starts then fails partway through. Check `df -h ~/.ollama/models` (or wherever Ollama stores models). 70B Q4 needs ~40 GB free.

Fix

Free disk space. Move Ollama models dir to larger drive: `OLLAMA_MODELS=/path/to/larger/disk ollama serve`. Or remove unused models: `ollama rm <model>`.
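The cleanup-or-relocate sequence, with `/data/ollama` standing in for your larger volume:

```
# See what's eating space and what can go
df -h ~/.ollama/models
ollama list
ollama rm unused-model            # placeholder name

# One-off run with the model store on a bigger disk
mkdir -p /data/ollama
OLLAMA_MODELS=/data/ollama ollama serve

# systemd installs: make it permanent with
#   sudo systemctl edit ollama    -> Environment="OLLAMA_MODELS=/data/ollama"
#   sudo systemctl restart ollama
```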

#6

GGUF file imported via Modelfile but the tokenizer can't be detected

Diagnose

`ollama create mymodel -f Modelfile` succeeds but `ollama run mymodel` errors with 'model not found' or a tokenizer fallback warning. The Modelfile FROM path points to a valid GGUF, but Ollama can't introspect the built-in tokenizer metadata.

Fix

Verify the GGUF's metadata with a dump tool such as `gguf-dump` from the `gguf` Python package; it prints the tokenizer info, chat template, and BOS/EOS tokens. If 'tokenizer.ggml.model' is missing from the output, the GGUF was converted without tokenizer metadata. Re-download from a reputable source (bartowski, lmstudio-community on HuggingFace) whose GGUFs consistently include tokenizer configs.
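A sketch of the check using `gguf-dump` (flag names per current gguf-py; confirm against your installed version):

```
pip install -U gguf

# Dump key/value metadata without walking every tensor
gguf-dump --no-tensors model.gguf | grep -i tokenizer

# A healthy GGUF lists entries such as:
#   tokenizer.ggml.model
#   tokenizer.ggml.tokens
#   tokenizer.chat_template
# If tokenizer.ggml.model is absent, the conversion dropped the
# tokenizer; re-download from a source with complete metadata.
```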

#7

GGUF metadata declares a different architecture than Ollama expects

Diagnose

Ollama auto-detects the model architecture from the GGUF's `general.architecture` metadata, but some conversion tools embed generic or wrong architecture strings. `ollama serve` then logs an 'unknown/unsupported model architecture' style error even though the weights themselves are fine, e.g. when an outdated converter stamped a newer model family with a string Ollama's parser doesn't recognize.

Fix

A Modelfile can't override the architecture string itself. If the real problem is only a missing chat template, you can supply one: `FROM ./model.gguf` followed by `TEMPLATE """{{ .Prompt }}"""` and any needed `PARAMETER stop "<|im_end|>"` lines. If the `general.architecture` metadata is genuinely wrong, reconvert from the HuggingFace safetensors source using the current `convert_hf_to_gguf.py` script from llama.cpp HEAD, as sketched below.
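A reconversion sketch; the repo id and quant type are examples, and the converter script name (`convert_hf_to_gguf.py`) is as of current llama.cpp:

```
# Fetch the original safetensors repo (placeholder repo id)
huggingface-cli download some-org/some-model --local-dir ./hf-model

# Convert with the current llama.cpp converter, which writes correct
# general.architecture metadata and embeds the chat template
git clone https://github.com/ggml-org/llama.cpp
pip install -r llama.cpp/requirements.txt
python llama.cpp/convert_hf_to_gguf.py ./hf-model --outfile model.gguf --outtype q8_0

printf 'FROM ./model.gguf\n' > Modelfile
ollama create my-model -f Modelfile
```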

#8

HuggingFace GGUF loads in llama.cpp but Ollama rejects it

Diagnose

You downloaded a GGUF from HuggingFace and imported it. The model works in llama.cpp but Ollama refuses it. Ollama has stricter expectations about GGUF version numbers and metadata fields than llama.cpp.

Fix

Use the Modelfile's `FROM` directive with the full local path: `FROM /absolute/path/to/model.gguf`. If Ollama still rejects it, re-download from Ollama's official registry — the library models are tested against Ollama's parser. HuggingFace GGUFs work in llama.cpp, but Ollama adds a metadata validation step that some converters miss.

Frequently asked questions

How do I find the right Ollama model name + tag?

Browse ollama.com/library for the official catalog. Each model page lists all available tags + sizes. Default tag is usually the recommended quant for most users.

Can I run any HuggingFace GGUF model in Ollama?

Yes, via Modelfile import: `FROM /path/to/model.gguf` + `ollama create mymodel -f Modelfile`. You handle the GGUF download yourself; recent Ollama builds can also pull GGUF repos directly with `ollama run hf.co/<user>/<repo>`.

Why is Ollama's registry separate from HuggingFace?

Ollama pre-packages models with default chat templates + system prompts + recommended quants. HuggingFace ships raw weights. Ollama's registry is curated for plug-and-play; HuggingFace is the broader source.

How do I translate a HuggingFace model name into an Ollama pull command?

There's no automatic mapping. Ollama's registry is curated — not every HuggingFace model is mirrored. Search ollama.com/library for the base model name (e.g., 'llama3.1' covers Llama 3.1 and its finetunes). For models not in the registry, use the Modelfile import path: download the GGUF from HuggingFace and run `ollama create mymodel -f Modelfile`.

Why does 'ollama pull mistral' work but 'ollama pull mistral-7b-instruct-v0.3' doesn't?

Ollama's tag granularity varies by model. Some models expose fine-grained tags (`llama3.1:70b-q4_K_M`), others only ship one or two quants under the base name. Check the model's page on ollama.com/library — the 'Tags' tab lists all available variants. If your exact quant isn't listed, pull the closest tag or import a custom GGUF via Modelfile.

What's the Modelfile syntax for importing a GGUF with a custom chat template?

Minimal Modelfile for a custom GGUF:

```
FROM ./llama-3.1-8b-q4_k_m.gguf
TEMPLATE """<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```

Get the template + stop tokens from the model's HuggingFace page (usually in `tokenizer_config.json`). Then `ollama create mymodel -f Modelfile`.

Related troubleshooting

When the fix is hardware

A surprising fraction of troubleshooting tickets resolve to: this card doesn't have enough VRAM for what you're asking it to do. If you're hitting OOM after every reasonable fix, or your GPU genuinely can't fit the model you need, it's upgrade time.