Model loaded but tokenizer vocab size mismatch
Vocab size mismatch: model has X tokens, tokenizer has Y
By Fredoline Eruo · Last verified Jun 12, 2026
Cause
The model's config.json declares a vocabulary size (vocab_size) of X, but the tokenizer files (vocab.json or tokenizer.json) define Y tokens, so transformers refuses to load the model. Common causes:
- Partial download (one file missing)
- Mixed files from different model versions in the same directory
- A LoRA-merged model where the merge process didn't update the config
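To see which of these causes applies, it can help to compare the two numbers directly from the files on disk. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and it assumes tokenizer.json stores its vocabulary as a token-to-id mapping under model.vocab (the BPE/WordPiece layout), which may not hold for every tokenizer type.

```python
import json
from pathlib import Path


def declared_vocab_size(model_dir):
    """Return vocab_size from config.json, or None if the file or key is absent."""
    path = Path(model_dir) / "config.json"
    if not path.is_file():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("vocab_size")


def tokenizer_vocab_size(model_dir):
    """Count distinct token ids in tokenizer.json, or None if the file is absent.

    Assumption: model.vocab is a token -> id mapping (BPE/WordPiece layout).
    """
    path = Path(model_dir) / "tokenizer.json"
    if not path.is_file():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    ids = set(data["model"]["vocab"].values())
    # Added (special) tokens are listed separately and may reuse existing ids.
    ids.update(tok["id"] for tok in data.get("added_tokens", []))
    return len(ids)


def check_vocab(model_dir):
    """Return (config_size, tokenizer_size, sizes_match)."""
    cfg = declared_vocab_size(model_dir)
    tok = tokenizer_vocab_size(model_dir)
    return cfg, tok, cfg is not None and cfg == tok
```

Calling check_vocab("./your-model") returns both sizes. A None in either position means the corresponding file is missing, which points to a partial download rather than a version mix-up.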
Solution
Re-download the full model to a fresh directory:
hf download meta-llama/Llama-3.1-8B-Instruct --local-dir ./llama-31-8b-clean
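After the download finishes, a quick completeness check can rule out another partial download. This sketch is an illustration, not an official tool: missing_files is a hypothetical helper, the default list of required files is an assumption (the exact set varies by model), and it relies on sharded checkpoints shipping a model.safetensors.index.json whose weight_map names each shard file.

```python
import json
from pathlib import Path


def missing_files(model_dir, required=("config.json", "tokenizer_config.json")):
    """Return a sorted list of expected files that are absent from model_dir."""
    root = Path(model_dir)
    expected = set(required)
    index = root / "model.safetensors.index.json"
    if index.is_file():
        # weight_map maps tensor names to the shard file that stores them.
        weight_map = json.loads(index.read_text(encoding="utf-8"))["weight_map"]
        expected.update(weight_map.values())
    return sorted(name for name in expected if not (root / name).is_file())
```

An empty list from missing_files("./llama-31-8b-clean") suggests the directory is complete with respect to the files checked; any names returned are the ones to fetch again.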
If you're sure the files are correct but the vocab sizes still differ (rare; it happens with custom-trained models), update the config manually:
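from transformers import AutoConfig, AutoTokenizer
# Load the tokenizer and config from the same local directory
tok = AutoTokenizer.from_pretrained("./your-model")
config = AutoConfig.from_pretrained("./your-model")
# len(tok) counts added tokens too; tok.vocab_size does not
config.vocab_size = len(tok)  # match config to tokenizer
# This rewrites config.json only; the saved weights are unchanged
config.save_pretrained("./your-model")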
This mismatch is common after LoRA merging. When you merge a LoRA back into the base model, vocab.json and config.json must both come from the base model (the LoRA adapter has neither). Don't mix and match files from different stages of training.
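When it is unclear which file is the stale one, the weights themselves can act as a tiebreaker: the embedding matrix has one row per token id. The sketch below reads only the JSON header of a .safetensors file (an 8-byte little-endian length followed by that many bytes of JSON), so it does not load any tensors. The function name is hypothetical, and the default tensor name model.embed_tokens.weight is an assumption that fits Llama-style checkpoints; other architectures use different names.

```python
import json
import struct


def embedding_rows(safetensors_path, tensor_name="model.embed_tokens.weight"):
    """Return the first dimension of the named tensor, or None if it is not in this file."""
    with open(safetensors_path, "rb") as f:
        # The file starts with the header length as an unsigned 64-bit little-endian integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    entry = header.get(tensor_name)
    return entry["shape"][0] if entry else None
```

For sharded checkpoints the embedding lives in only one shard, so a None result for a given file may just mean the tensor is stored in a different shard.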
Related errors
- TypeError: 'NoneType' object is not subscriptable in tokenizer
- Quantized model produces garbage / never stops generating
- OSError: Can't load tokenizer for ... / no file named tokenizer.json
- GGUF model outputs garbage — tokenizer / chat-template mismatch
- Model produces gibberish or repeats one token forever