RUNLOCALAI · v38
Model loaded but tokenizer vocab size mismatch

Vocab size mismatch: model has X tokens, tokenizer has Y
By Fredoline Eruo · Last verified Jun 12, 2026

Cause

When the model's config.json declares a vocab size of X but the tokenizer files (vocab.json or tokenizer.json) define Y tokens, transformers refuses to load the model. Common causes:

  • Partial download (one file missing)
  • Mixed files from different model versions in the same directory
  • A LoRA-merged model where the merge process didn't update the config
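Before re-downloading anything, it can help to confirm which two numbers actually disagree. The sketch below uses only the Python standard library. It assumes a Hugging Face style directory containing config.json and a tokenizer.json in the fast-tokenizer layout (a model.vocab table plus an added_tokens list). `vocab_sizes` is a hypothetical helper name, and for tokenizers stored in other formats the count may differ from what `len(tokenizer)` reports in transformers.

```python
import json
from pathlib import Path


def vocab_sizes(model_dir):
    """Return (config_vocab_size, tokenizer_token_count) for a model directory.

    Assumes config.json has a top-level "vocab_size" and tokenizer.json uses
    the fast-tokenizer layout: model.vocab plus an added_tokens list.
    """
    model_dir = Path(model_dir)
    config = json.loads((model_dir / "config.json").read_text(encoding="utf-8"))
    tok = json.loads((model_dir / "tokenizer.json").read_text(encoding="utf-8"))

    vocab = tok["model"]["vocab"]
    if isinstance(vocab, dict):
        # BPE / WordPiece style: {token_string: id}
        ids = set(vocab.values())
    else:
        # Unigram style: a list of [token, score]; the id is the list index
        ids = set(range(len(vocab)))

    # Added tokens (special tokens, fine-tuning additions) carry explicit ids
    ids.update(entry["id"] for entry in (tok.get("added_tokens") or []))
    return config.get("vocab_size"), len(ids)
```

Calling `vocab_sizes("./your-model")` returns the declared size and the counted size as a pair. A config value slightly larger than the tokenizer count is not always a fault, since some models pad the embedding table; a tokenizer count larger than the config value is the more suspicious direction.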

Solution

Re-download the full model to a fresh directory:

hf download meta-llama/Llama-3.1-8B-Instruct --local-dir ./llama-31-8b-clean
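After the download finishes, a quick completeness check can rule out the partial-download cause. This is a minimal sketch, not part of any official tooling: `missing_files` and `EXPECTED_FILES` are hypothetical names, and the baseline file list is an assumption you should adjust for your model. It relies on the fact that sharded safetensors checkpoints ship a model.safetensors.index.json whose weight_map names every shard file.

```python
import json
from pathlib import Path

# Hypothetical baseline; adjust for the model family you downloaded.
EXPECTED_FILES = ("config.json", "tokenizer.json")


def missing_files(model_dir, expected=EXPECTED_FILES):
    """List files that should be in model_dir but are not on disk."""
    model_dir = Path(model_dir)
    missing = [name for name in expected if not (model_dir / name).is_file()]

    # Sharded checkpoints ship an index whose "weight_map" maps each tensor
    # name to the shard file that stores it.
    index_path = model_dir / "model.safetensors.index.json"
    if index_path.is_file():
        index = json.loads(index_path.read_text(encoding="utf-8"))
        shards = sorted(set(index["weight_map"].values()))
        missing += [name for name in shards if not (model_dir / name).is_file()]
    return missing
```

An empty list from `missing_files("./llama-31-8b-clean")` means every file the sketch knows about is present; it does not verify file contents or checksums.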

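If you're sure the files are right but the vocab sizes still disagree (rare; it mostly happens with custom-trained models), update config.json to match the tokenizer:

from transformers import AutoConfig, AutoTokenizer

model_dir = "./your-model"
tok = AutoTokenizer.from_pretrained(model_dir)
config = AutoConfig.from_pretrained(model_dir)
# Only safe if the saved weights already have len(tok) embedding rows;
# otherwise the weights will no longer match the config.
config.vocab_size = len(tok)  # match config to tokenizer
config.save_pretrained(model_dir)  # overwrites config.json in place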
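Editing config.json only helps when the saved weights already have one embedding row per tokenizer entry. One way to check this without loading the model is to read the safetensors header, which is an 8-byte little-endian length followed by a JSON table of tensor names, dtypes and shapes. The sketch below is an illustration under assumptions: `embedding_rows` is a hypothetical helper, and the default tensor-name suffix fits Llama-style checkpoints but may not match other architectures.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Parse the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def embedding_rows(model_dir, suffix="embed_tokens.weight"):
    """Return the row count of the first tensor whose name ends with suffix.

    The default suffix is an assumption that fits Llama-style checkpoints;
    other architectures name their input embedding differently.
    """
    for path in sorted(Path(model_dir).glob("*.safetensors")):
        for name, meta in read_safetensors_header(path).items():
            if name != "__metadata__" and name.endswith(suffix):
                return meta["shape"][0]
    return None  # no matching tensor found in any shard
```

If the row count equals the tokenizer size, updating config.json is the right fix. If it equals the old config value instead, the weights themselves need resizing (transformers exposes `resize_token_embeddings` on loaded models for this), and changing the config alone would leave the checkpoint inconsistent.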

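This is common after LoRA merging. When you merge a LoRA back into the base model, the tokenizer files (vocab.json or tokenizer.json) and config.json must both come from the same stage. If the fine-tune left the vocabulary untouched, use the base model's files, since the LoRA adapter usually ships neither. If the fine-tune added tokens, use the tokenizer saved with that run and make sure config.json reflects the resized vocabulary. Don't mix and match files from different stages of training.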

Related errors

  • TypeError: 'NoneType' object is not subscriptable in tokenizer
  • Quantized model produces garbage / never stops generating
  • OSError: Can't load tokenizer for ... / no file named tokenizer.json
  • GGUF model outputs garbage — tokenizer / chat-template mismatch
  • Model produces gibberish or repeats one token forever
