RUNLOCALAI · v38
Model format / GGUF
Verified by owner

llama.cpp: error loading model — bad magic / unsupported GGUF

llama_model_load: error loading model: failed to load model 'X': bad magic / unsupported GGUF version
By Fredoline Eruo · Last verified Jun 12, 2026

Cause

Environment: llama.cpp and downstream tools (Ollama, LM Studio, koboldcpp) loading a GGUF file built by an incompatible version.

Severity: medium. The file cannot be loaded until you update llama.cpp, redownload the file, or re-quantize it.

  • GGUF v1/v2 file loaded by a llama.cpp build that only supports v3+
  • Newer GGUF v3 file with new metadata loaded by old llama.cpp
  • File is actually a different format renamed to .gguf (e.g. AWQ safetensors)
  • Partial or failed download: the file is truncated, or its first bytes are HTML from a CDN error page
  • Custom-quantized file with non-standard tensor types your build wasn't compiled with
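
To separate the version-related causes from the wrong-format and bad-download ones, it can help to read the header fields directly. The sketch below assumes the standard GGUF layout (a 4-byte ASCII magic followed by a little-endian uint32 version). `gguf_version` is a hypothetical helper name used for illustration, not part of llama.cpp.

```shell
# Hypothetical helper: print the GGUF version stored in a file's header.
# Assumes the standard layout: bytes 0-3 = "GGUF", bytes 4-7 = uint32 (little-endian).
gguf_version() {
  file=$1
  # Bytes 0-3 as hex; "GGUF" is 47 47 55 46.
  magic=$(head -c 4 "$file" | od -An -tx1 | tr -d ' \n')
  if [ "$magic" != "47475546" ]; then
    echo "not-gguf"
    return 1
  fi
  # Bytes 4-7 as four unsigned decimal values, lowest byte first.
  set -- $(od -An -tu1 -j4 -N4 "$file")
  if [ "$#" -ne 4 ]; then
    echo "truncated"
    return 1
  fi
  # Assemble the little-endian uint32 by hand so host byte order does not matter.
  echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Usage (model.gguf is a placeholder path):
#   gguf_version model.gguf
```

A result of `not-gguf` points at the renamed-format or bad-download causes. A numeric version that your build rejects points at a version mismatch between the file and llama.cpp.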

Solution

1. Verify the file isn't corrupted — first 4 bytes should be GGUF:

xxd model.gguf | head -1
# 00000000: 4747 5546 ...   (i.e. "GGUF")

If you see HTML or zeros, redownload.
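
If you check many files, the same inspection can be scripted. This is a minimal sketch: `sniff_first_bytes` is a hypothetical helper, and the HTML/XML prefixes it matches are an assumed list of common ones rather than an exhaustive set.

```shell
# Hypothetical helper: classify a file by its first 4 bytes.
sniff_first_bytes() {
  file=$1
  # First 4 bytes as a hex string, e.g. "47475546" for "GGUF".
  magic_hex=$(head -c 4 "$file" | od -An -tx1 | tr -d ' \n')
  case "$magic_hex" in
    47475546) echo "gguf" ;;                 # "GGUF"
    00000000) echo "zeros" ;;                # zero-filled, likely an incomplete download
    3c21444f|3c21646f|3c68746d|3c48544d|3c3f786d)
              echo "html" ;;                 # "<!DO", "<!do", "<htm", "<HTM", "<?xm"
    "")       echo "empty" ;;                # zero-length file
    *)        echo "unknown" ;;              # some other format renamed to .gguf
  esac
}

# Usage (model.gguf is a placeholder path):
#   sniff_first_bytes model.gguf
```

Anything other than `gguf` means the file should be redownloaded or replaced before you spend time rebuilding llama.cpp.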

2. Update llama.cpp — most "bad magic" errors are stale builds:

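cd llama.cpp
git pull
# Current llama.cpp builds with CMake; the old Makefile build has been removed.
# Drop -DGGML_CUDA=ON for a CPU-only build.
rm -rf build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j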

3. Re-quantize from the original safetensors with a current llama-quantize:

# Convert HF model to GGUF F16
python convert_hf_to_gguf.py /path/to/model --outtype f16 --outfile model-f16.gguf
# Quantize to your target (CMake builds place the binary in build/bin)
./build/bin/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M

4. Pull a known-good GGUF from a maintained repo:

hf download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF \
  Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
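
Because a truncated download is one of the causes listed above, it may be worth comparing the file's SHA-256 against the checksum published for it before loading. A minimal sketch, assuming you have copied the expected hash from the repo's file page; `verify_sha256` is a hypothetical helper name.

```shell
# Hypothetical helper: compare a file's SHA-256 with an expected lowercase hex digest.
verify_sha256() {
  file=$1
  expected=$2
  # Prefer sha256sum (common on Linux); fall back to shasum (common on macOS).
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  if [ "$actual" = "$expected" ]; then
    echo "ok"
  else
    echo "mismatch"
    return 1
  fi
}

# Usage (replace the placeholder with the published checksum):
#   verify_sha256 Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf <expected-sha256>
```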

5. If the file came from a vendor build, check their llama.cpp commit pin in the model card and match it:

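git checkout <commit-sha>
# Commits that predate the CMake-only build use `make clean && make -j` instead
rm -rf build && cmake -B build && cmake --build build --config Release -j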

Related errors

  • llama.cpp: failed to mmap GGUF file
  • Failed to load model: GGUF version mismatch
