RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Troubleshooting Local AI

15. Troubleshooting Runbook Project

Chapter 15 of 15 · 20 min
KEY INSIGHT

A runbook is not documentation you write once; it is documentation you update every time you solve a new problem. After each debugging session, spend 5 minutes adding the fix to your runbook. Six months later, you'll thank yourself.

Building Your Personal Runbook

A runbook documents your specific system's configuration, recurring problems, and their fixes. Generic documentation covers your hardware; your runbook covers your system.

Runbook Template

# System: [Hostname/Description]
## Hardware
- GPU: [Model, VRAM]
- RAM: [Total]
- OS: [Distribution, Kernel version]

## Common Problems

### Problem: Ollama returns "connection refused"
**Symptoms**: `curl http://localhost:11434/api/tags` fails
**Cause**: The Ollama service is not running
**Fix**:
```bash
sudo systemctl restart ollama
sudo systemctl status ollama
```

### Problem: Model loads but inference is slow
**Symptoms**: <5 tokens/second on a 7B model
**Cause**: Running on CPU instead of GPU
**Fix**:
```bash
# Verify GPU detection
python -c "import torch; print(torch.cuda.is_available())"
# Check environment variables
echo "$CUDA_VISIBLE_DEVICES"
```

## Installation Notes
- CUDA Version: 12.1
- Driver Version: 535.154.05
- Ollama Version: 0.1.26

## Model Registry
| Model | Size | Quantization | Location |
|-------|------|--------------|----------|
| Llama-2-7B | 7B | Q4_K_M | /models/llama-2-7b-q4 |
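
The slow-inference entry in the template uses a tokens/second threshold, so it helps to measure that number the same way every time you update the runbook. The sketch below is one possible approach, not part of the course material: it assumes you read `eval_count` (tokens generated) and `eval_duration` (nanoseconds) from a non-streaming Ollama `/api/generate` response. The helper name `tokens_per_second`, the model name `llama2`, and the use of `jq` in the commented example are illustrative assumptions.

```shell
# Convert Ollama's eval_count (tokens generated) and eval_duration
# (nanoseconds) into tokens per second.
# tokens_per_second is a hypothetical helper name, not an Ollama command.
tokens_per_second() {
    awk -v n="$1" -v ns="$2" 'BEGIN {
        if (ns <= 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.2f\n", n * 1e9 / ns
    }'
}

# Example (assumes jq is installed and a model named "llama2" is pulled):
# resp=$(curl -s http://localhost:11434/api/generate \
#     -d '{"model": "llama2", "prompt": "Say hello.", "stream": false}')
# tokens_per_second "$(echo "$resp" | jq .eval_count)" \
#                   "$(echo "$resp" | jq .eval_duration)"
```

Recording the result next to the model's row in the registry gives you a baseline to compare against when inference later feels slow.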


EXERCISE

Build a runbook for your system. Document hardware spec (GPU model, VRAM, driver version), installed AI frameworks with versions, and the three most common errors you've encountered (symptoms, cause, fix). Then create a shell script that generates a diagnostic report and save its output as your baseline.
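
A possible starting point for the diagnostic report script is sketched here. It is a minimal sketch under stated assumptions rather than the course's official script: the section titles, the helper names `section` and `generate_report`, and the baseline file name are all illustrative. Each tool is checked with `command -v` first, so the report still completes on a machine where `nvidia-smi`, `ollama`, or PyTorch is missing.

```shell
# Sketch of a baseline diagnostic report (POSIX sh compatible).
# section and generate_report are hypothetical helper names.

# Print a section header, then run the command if its tool is installed.
section() {
    title=$1; shift
    printf '\n== %s ==\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command exited with status $?)"
    else
        echo "($1 not found)"
    fi
}

# Collect the checks used elsewhere in this chapter into one report.
generate_report() {
    echo "Diagnostic report for $(uname -n) at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    section "Kernel" uname -sr
    section "GPU and driver" nvidia-smi
    section "Ollama version" ollama --version
    section "Ollama API" curl -sS --max-time 5 http://localhost:11434/api/tags
    section "PyTorch CUDA" python3 -c "import torch; print(torch.cuda.is_available())"
    printf '\n== CUDA_VISIBLE_DEVICES ==\n%s\n' "${CUDA_VISIBLE_DEVICES:-(unset)}"
}

# Example: save today's output as the baseline (file name is an assumption).
# generate_report > "baseline-$(date +%F).txt"
```

Running the same script after a driver or framework upgrade and comparing the new output to the baseline with `diff` shows what changed.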

Completion Criteria

You have completed this course when you can:

  • Run the full GPU diagnostic sequence and interpret each command's output
  • Identify which system layer (hardware, driver, runtime, application) is responsible for any given error
  • Fix the 10 most common local AI errors from memory rather than by searching
  • Build a runbook that documents your specific system's configuration and recurring fixes
  • Profile inference performance and identify the bottleneck (compute, memory bandwidth, or transfer)

These skills are not about memorizing error messages—they are about developing a mental model of how local AI systems stack, so diagnosing a new error takes minutes instead of hours.
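
The layer model in the criteria above (hardware, driver, runtime, application) can be turned into a quick triage habit: check the layers from the bottom up and stop at the first one that fails. The sketch below illustrates that idea only. The helper name `check_layers` and the specific commands in the commented example are assumptions chosen for an NVIDIA plus Ollama setup; substitute the checks that match your own system.

```shell
# Run layer checks in order and stop at the first failure.
# Each argument has the form "layer:command"; check_layers is a
# hypothetical helper name.
check_layers() {
    for entry in "$@"; do
        layer=${entry%%:*}   # text before the first colon
        cmd=${entry#*:}      # everything after the first colon
        if sh -c "$cmd" >/dev/null 2>&1; then
            echo "ok    $layer"
        else
            echo "FAIL  $layer"
            return 1
        fi
    done
}

# Example (commands are illustrative assumptions for one setup):
# check_layers \
#     "hardware:lspci | grep -qi nvidia" \
#     "driver:nvidia-smi" \
#     "runtime:ollama --version" \
#     "application:curl -sf --max-time 5 http://localhost:11434/api/tags"
```

The first `FAIL` line names the lowest layer that is broken, which is usually the right place to start debugging because every layer above it depends on it.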
