Model Download Failures — Troubleshooting Local AI (Chapter 5)

Hugging Face Hub Failures

Model downloads fail most commonly due to network issues, incomplete downloads, or filesystem permission problems.

# Download with debug logging to see where it fails.
# hf_transfer is left disabled here: it trades error handling for speed.
HF_HUB_VERBOSITY=debug huggingface-cli download \
  meta-llama/Llama-2-7b-hf \
  --local-dir /models/llama-2-7b-hf \
  --local-dir-use-symlinks False
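
Filesystem problems can be ruled out before a long download starts. The sketch below is one possible preflight check, not part of any Hugging Face tooling: the function name `preflight` and the `required_bytes` figure are hypothetical, and you would substitute the total size shown on the model's Hub page.

```python
import os
import shutil
from pathlib import Path


def preflight(target_dir: str, required_bytes: int) -> list[str]:
    """Return a list of problems likely to break a download into target_dir."""
    problems = []

    # The target is often created by the download tool itself, so walk up
    # to the nearest ancestor that already exists and check that instead.
    probe = Path(target_dir).expanduser().resolve()
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent

    if not probe.is_dir():
        problems.append(f"{probe} exists but is not a directory")
        return problems

    # Creating files needs write and search (execute) permission on the directory.
    if not os.access(probe, os.W_OK | os.X_OK):
        problems.append(f"no write permission on {probe}")

    free = shutil.disk_usage(probe).free
    if free < required_bytes:
        problems.append(
            f"only {free / 1e9:.1f} GB free on {probe}, "
            f"need {required_bytes / 1e9:.1f} GB"
        )
    return problems


if __name__ == "__main__":
    # 30 GB is a placeholder, not the measured size of this repository.
    for problem in preflight("/models/llama-2-7b-hf", required_bytes=30 * 10**9):
        print("PROBLEM:", problem)
```

An empty result means only that these two checks passed. Network failures and interrupted transfers still need the steps that follow.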

Handling Rate Limits

# Set your access token to raise rate limits (gated repositories such as this one also require it)
export HF_TOKEN="hf_your_token_here"
huggingface-cli download meta-llama/Llama-2-7b-hf --token "$HF_TOKEN"

If you are rate limited, wait about 60 minutes before retrying, or fall back to a git-based download:

git lfs install
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/meta-llama/Llama-2-7b-hf
cd Llama-2-7b-hf
git lfs pull
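
Waiting can also be automated rather than done by hand. The following is a minimal sketch of retrying with backoff, assuming the server may send a `Retry-After` header with a number of seconds (whether the Hub does so is an assumption here). The names `wait_seconds`, `call_with_backoff`, and `get_headers` are hypothetical helpers, not part of `huggingface_hub`.

```python
import random
import time
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


def wait_seconds(headers: Mapping[str, str], attempt: int,
                 base: float = 2.0, cap: float = 3600.0) -> float:
    """Use Retry-After when it is a whole number of seconds, else capped exponential backoff."""
    value = headers.get("Retry-After", "").strip()
    if value.isdigit():
        return min(float(value), cap)
    return min(base * (2 ** attempt), cap)


def call_with_backoff(fn: Callable[[], T],
                      get_headers: Callable[[Exception], Optional[Mapping[str, str]]],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying only when get_headers(exc) marks the error as a rate limit.

    get_headers returns the response headers for a rate-limit error,
    or None for any other error, which is re-raised immediately.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            headers = get_headers(exc)
            if headers is None or attempt == max_attempts - 1:
                raise
            # A little random jitter avoids many clients retrying in lockstep.
            sleep(wait_seconds(headers, attempt) + random.uniform(0, 1))
    raise ValueError("max_attempts must be at least 1")
```

To use this with a real download, `fn` would wrap the download call, and `get_headers` would inspect the raised error for an HTTP 429 status and return its headers. That wiring depends on the client library's exception types and is left out as an assumption. A plain `dict` is case-sensitive, so the header key must match `Retry-After` exactly unless a case-insensitive mapping is passed.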

Verifying Downloads

Corrupted or truncated downloads cause runtime failures that are hard to tell apart from bugs in the model files themselves, so verify the files before debugging anything else.

# Compare local file sizes against the sizes listed on the model's Hub page
ls -lh /models/llama-2-7b-hf/pytorch_model-*.bin | head -5

# Verify safetensors integrity by loading every shard
# (load_file reads all tensors into CPU memory, so this can be slow for large models)
python -c "
from safetensors.torch import load_file
import glob
for f in sorted(glob.glob('/models/llama-2-7b-hf/*.safetensors')):
    try:
        load_file(f)
        print(f'OK: {f}')
    except Exception as e:
        print(f'CORRUPT: {f} - {e}')
"
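
File sizes and a successful load catch truncation, but a checksum comparison is stricter. The sketch below assumes you have copied the expected SHA-256 digests from a trusted source, such as the file details shown on the Hub. The helper names `sha256_of` and `find_mismatches` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large shards never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(model_dir: str, expected: dict[str, str]) -> dict[str, str]:
    """Map file name -> problem for each expected file that is missing or hashes differently."""
    problems = {}
    for name, want in expected.items():
        path = Path(model_dir) / name
        if not path.is_file():
            problems[name] = "missing"
        elif sha256_of(str(path)) != want.strip().lower():
            problems[name] = "checksum mismatch"
    return problems
```

Calling `find_mismatches("/models/llama-2-7b-hf", expected)` with `expected` mapping each file name to its published digest returns an empty dict when every file matches. Any file reported here should be deleted and downloaded again before you investigate other causes.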

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
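
One way to capture those details consistently is a small script that writes them out as JSON. This is only a sketch: the function name `environment_record` and the package list are illustrative choices, and the observed output still has to be recorded by hand.

```python
import json
import platform
from importlib import metadata


def environment_record(packages: list[str], data_path: str) -> dict:
    """Collect interpreter, platform, and installed package versions for a run log."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
        "data_path": data_path,
    }


if __name__ == "__main__":
    record = environment_record(
        ["huggingface_hub", "safetensors", "torch"],  # illustrative list
        data_path="/models/llama-2-7b-hf",
    )
    print(json.dumps(record, indent=2))
```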