llama.cpp: error loading model — bad magic / unsupported GGUF

llama_model_load: error loading model: failed to load model 'X': bad magic / unsupported GGUF version

By Fredoline Eruo · Last verified Jun 12, 2026

Cause

**Environment:** [llama.cpp](/tools/llama-cpp) and downstream tools ([Ollama](/tools/ollama), [LM Studio](/tools/lm-studio), [koboldcpp](/tools/koboldcpp)) loading a GGUF file written by an incompatible version of the format or tooling.

**Severity: medium.** The file is unusable with your current build until you update llama.cpp, re-download the file, or re-quantize it.

- GGUF v1/v2 file loaded by a llama.cpp build that only supports v3+
- Newer GGUF v3 file with metadata that an old llama.cpp build doesn't recognize
- File is actually a different format renamed to `.gguf` (e.g. AWQ safetensors)
- Partial or failed download: the first bytes are HTML from a CDN error page
- Custom-quantized file with non-standard tensor types your build wasn't compiled to support
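Several of the causes above can be told apart by looking at the first few bytes of the file. The following is a minimal sketch, not part of llama.cpp: `classify_model_file` is a hypothetical helper name, and it assumes the standard GGUF layout (4-byte `GGUF` magic followed by a little-endian 32-bit version) and the safetensors layout (8-byte header length followed by a JSON object). It only inspects the header, so it cannot detect truncation later in the file or unsupported tensor types.

```bash
#!/usr/bin/env bash
# classify_model_file: hypothetical helper, not shipped with llama.cpp.
# Prints a one-line guess about what kind of file this is.
classify_model_file() {
  local file="$1" magic_hex byte8
  # First 4 bytes as hex (hex avoids trouble with NUL bytes in shell variables).
  magic_hex=$(head -c 4 < "$file" | od -An -tx1 | tr -d ' \n')
  case "$magic_hex" in
    47475546)
      # "GGUF": bytes 4-7 hold the format version, assumed little-endian here.
      set -- $(od -An -tx1 -j4 -N4 < "$file")
      if [ "$#" -ne 4 ]; then
        echo "gguf (header truncated)"
        return
      fi
      echo "gguf v$(( 0x$4$3$2$1 ))"
      ;;
    00000000)
      echo "zeros (likely an incomplete download)"
      ;;
    3c*)
      # Starts with "<": probably an HTML or XML error page.
      echo "html (likely an error page saved as the model)"
      ;;
    *)
      # safetensors: 8-byte length, then a JSON header starting with "{".
      byte8=$(od -An -tx1 -j8 -N1 < "$file" 2>/dev/null | tr -d ' \n')
      if [ "$byte8" = "7b" ]; then
        echo "possibly safetensors (not GGUF)"
      else
        echo "unknown magic ${magic_hex}"
      fi
      ;;
  esac
}
```

Run it as `classify_model_file model.gguf`. A result of `gguf v1` or `gguf v2` points to the first cause in the list, while `html`, `zeros`, or `possibly safetensors` point to a bad download or a renamed file rather than a llama.cpp version problem.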

Solution

**1. Verify the file isn't corrupted.** The first 4 bytes should be `GGUF`:

```bash
xxd model.gguf | head -1
# 00000000: 4747 5546 ...   (i.e. "GGUF")
```

If you see HTML or zeros, redownload.

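**2. Update llama.cpp.** A stale build is a common cause of the "unsupported GGUF version" variant of this error:

```bash
cd llama.cpp
git pull
# Recent llama.cpp trees build with CMake; binaries land in build/bin/.
# Drop -DGGML_CUDA=ON for a CPU-only build.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
# Older checkouts with a working Makefile used: make clean && GGML_CUDA=1 make -j
```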

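**3. Re-quantize from the original safetensors** with a current `llama-quantize`:

```bash
# Convert the HF model to a GGUF F16 file (run from the llama.cpp checkout)
python convert_hf_to_gguf.py /path/to/model --outtype f16 --outfile model-f16.gguf
# Quantize to your target type (binary is in build/bin/ for CMake builds,
# in the repo root for old make builds)
./build/bin/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```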

**4. Pull a known-good GGUF** from a maintained repo:

```bash
hf download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF \
  Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
```

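**5. If the file came from a vendor build**, check the llama.cpp commit pinned in the model card and build that exact commit:

```bash
git checkout <commit-sha>
# Older commits build with make; if the pinned commit has no working
# Makefile, use its CMake build instead.
make clean && make -j
```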
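After re-downloading a file (steps 1 and 4), the magic-byte check only proves that the start of the file is intact. Where the model's file page publishes a SHA-256 digest, comparing it against the local file catches truncation anywhere in the download. This is a minimal sketch: `verify_sha256` is a hypothetical helper name, and it assumes either `sha256sum` (GNU coreutils) or `shasum` (macOS) is on your PATH.

```bash
#!/usr/bin/env bash
# verify_sha256: hypothetical helper, not part of llama.cpp or the hf CLI.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  local file="$1" expected="$2" actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum < "$file" | awk '{print $1}')
  else
    # macOS ships shasum rather than sha256sum.
    actual=$(shasum -a 256 < "$file" | awk '{print $1}')
  fi
  # Hex digests are case-insensitive, so lowercase the expected value.
  expected=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
  if [ "$actual" = "$expected" ]; then
    echo "ok"
  else
    echo "mismatch: got $actual"
    return 1
  fi
}
```

Call it as `verify_sha256 model.gguf <digest-from-the-model-page>`. A mismatch means the download is incomplete or altered, and updating llama.cpp will not help.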
