Verified by owner
GGUF model outputs garbage — tokenizer / chat-template mismatch
(no error — generation is fluent gibberish, repeats one token, or emits raw special tokens like <|im_start|>)
By Fredoline Eruo · Last verified Jun 12, 2026
Environment: llama.cpp, Ollama, LM Studio, or koboldcpp running GGUF files.
Severity: medium (model loads but output is unusable).
Cause
- GGUF was converted with an old llama.cpp that didn't bundle the right tokenizer (Tekken for Mistral Nemo, GPT-2 BPE merges for Llama 3)
- Chat template baked into the GGUF doesn't match the model — Modelfile override needed
- BOS token policy wrong (add_bos_token set to true on a model trained without BOS)
- Special tokens (<|eot_id|>, <|im_end|>) not registered as stop tokens, so generation runs past end-of-turn
- LoRA-merged GGUF where the merge didn't update tokenizer metadata
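One way to confirm the stop-token cause is to scan a captured generation for leaked markers. The following is a minimal sketch, assuming the model output is available as plain text on stdin; `leaked_special_tokens` is a hypothetical helper name, not part of llama.cpp or Ollama.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the distinct <|...|> markers found in generated
# text read from stdin. Prints nothing when no marker is present.
leaked_special_tokens() {
  # -o prints only the matched markers, one per line; \| is a literal pipe in ERE.
  # LC_ALL=C keeps the sort order byte-wise and locale-independent.
  grep -oE '<\|[A-Za-z0-9_]+\|>' | LC_ALL=C sort -u
}
```

For example, `ollama run llama3.1:8b "Hello" | leaked_special_tokens` would list any markers that reached the output. An empty result suggests the stop tokens are being honored, though it does not rule out the other causes above.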
Solution
1. Re-pull the GGUF from a maintainer who tracks tokenizer updates:
ollama pull mistral-nemo:12b-instruct-2407
# or
hf download bartowski/Mistral-Nemo-Instruct-2407-GGUF \
Mistral-Nemo-Instruct-2407-Q4_K_M.gguf
2. Inspect the embedded chat template:
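# gguf-dump ships with the gguf Python package (pip install gguf); long values may be truncated, add --json for the full string
gguf-dump --no-tensors model.gguf | grep chat_template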
# or
ollama show llama3.1:8b --modelfile | grep -A5 TEMPLATE
3. Override with the correct template via Modelfile:
FROM ./model.gguf
TEMPLATE """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_of_text|>"
ollama create llama3.1:8b-fixed -f Modelfile
4. Disable BOS injection if the model was trained without one, by overriding the GGUF metadata at load time:
./llama-cli -m model.gguf --override-kv tokenizer.ggml.add_bos_token=bool:false -p "Hello"
5. Update llama.cpp and reconvert if the tokenizer is genuinely missing tokens (rare, but it happens with research models):
git pull && cmake -B build && cmake --build build --config Release -j
python convert_hf_to_gguf.py /path/to/hf-model
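After applying the steps above, it can help to check which template family a model is actually using. The following is a rough sketch that guesses the family from well-known markers: `<|start_header_id|>` for Llama 3, `<|im_start|>` for ChatML, and `[INST]` for Mistral-style templates. `detect_template_family` is a hypothetical helper, and the marker list is deliberately incomplete.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the chat-template family from template text on
# stdin by looking for each family's well-known marker.
detect_template_family() {
  local text
  text="$(cat)"
  # Quoted case patterns are matched literally, so [INST] is not a glob class.
  case "$text" in
    *'<|start_header_id|>'*) echo "llama3" ;;
    *'<|im_start|>'*)        echo "chatml" ;;
    *'[INST]'*)              echo "mistral" ;;
    *)                       echo "unknown" ;;
  esac
}
```

Piping the output of `ollama show llama3.1:8b --modelfile` into `detect_template_family` should print `llama3` when the template matches the model. A different answer points back to step 3.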