RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Troubleshooting Local AI

03. GPU Not Detected

Chapter 3 of 15 · 20 min
KEY INSIGHT

"GPU not detected" is three different problems depending on where the detection fails. `lspci` checks hardware, `nvidia-smi` checks the driver, and PyTorch's `cuda.is_available()` checks the runtime. Solve each in sequence.

The Diagnostic Sequence

GPU detection failures cascade upward from hardware to application: a failure at one layer hides every layer above it. Work through the steps in order, and stop at the first one that fails, because that is the layer to fix.
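The ordering idea can be sketched in a few lines of Python. This is a minimal illustration, not part of any tool: the function name `first_failing_layer` and the layer labels are hypothetical, and each check is assumed to be a callable that returns `True` when its layer is healthy.

```python
from typing import Callable, Optional


def first_failing_layer(checks: list[tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run checks in order and return the name of the first one that fails.

    Later checks are skipped, because a failure at a lower layer makes
    results from the layers above it meaningless.
    """
    for name, check in checks:
        if not check():
            return name
    return None  # every layer passed


# Stub checks standing in for the real commands in Steps 1 to 4.
layers = [
    ("hardware", lambda: True),
    ("driver", lambda: False),
    ("runtime", lambda: True),
    ("container", lambda: True),
]
print(first_failing_layer(layers))
```

With the stubs above, the driver layer is reported and the runtime and container checks never run.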

Step 1: Hardware Check

lspci | grep -i nvidia

If this returns nothing, the GPU is not visible to the Linux kernel. That points to a physical or firmware problem (card not seated, not powered, or the PCIe slot disabled in BIOS/UEFI) rather than a software problem. Reseat the card, check its power connectors, and check the PCIe settings in BIOS/UEFI.
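If you want the same check from a script, a rough Python equivalent might look like the sketch below. The function names are hypothetical, and the case-insensitive substring match is assumed to be close enough to `grep -i nvidia` for this purpose.

```python
import shutil
import subprocess


def nvidia_pci_lines(lspci_output: str) -> list[str]:
    """Return the lspci lines that mention NVIDIA, ignoring case (like grep -i)."""
    return [line for line in lspci_output.splitlines() if "nvidia" in line.lower()]


def check_hardware() -> list[str]:
    """Run lspci and return matching lines. An empty list means no NVIDIA device was enumerated."""
    if shutil.which("lspci") is None:
        raise RuntimeError("lspci not found; it is normally provided by the pciutils package")
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=True)
    return nvidia_pci_lines(result.stdout)
```

Calling `check_hardware()` on a machine with a working card should return at least one line; an empty list corresponds to the "returns nothing" case described above.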

Step 2: Driver Check

nvidia-smi

If this fails with "command not found", the NVIDIA driver is not installed (or its utilities are not on your PATH). If it fails with "No devices were found", the driver is running but did not find a GPU it can use, typically because the installed driver version does not support the card or the kernel module failed to initialize the device.

# Check loaded kernel modules
lsmod | grep nvidia
# Check dmesg for GPU-related errors (NVIDIA kernel messages are prefixed with NVRM)
sudo dmesg | grep -iE 'nvidia|nvrm'
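The failure modes of `nvidia-smi` can be told apart programmatically. The sketch below is an illustration under assumptions: the function names and the returned labels are hypothetical, and the third message it matches ("couldn't communicate with the NVIDIA driver") is not mentioned in this chapter but is the message `nvidia-smi` usually prints when the kernel module is not loaded.

```python
import shutil
import subprocess


def classify_driver_state(binary_found: bool, returncode: int, output: str) -> str:
    """Map an nvidia-smi result onto a coarse diagnosis label (labels are illustrative)."""
    if not binary_found:
        return "driver-not-installed"  # the shell would report "command not found"
    if returncode == 0:
        return "ok"
    text = output.lower()
    if "no devices were found" in text:
        return "driver-running-no-gpu"
    if "couldn't communicate with the nvidia driver" in text:
        return "kernel-module-not-loaded"  # assumption: message not covered in the chapter
    return "unknown-failure"


def check_driver() -> str:
    """Run nvidia-smi if it is on PATH and classify the outcome."""
    if shutil.which("nvidia-smi") is None:
        return classify_driver_state(False, 127, "")
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    return classify_driver_state(True, result.returncode, result.stdout + result.stderr)
```

Anything other than `"ok"` is a cue to look at `lsmod` and `dmesg` output before moving on to the runtime check.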

Step 3: CUDA Runtime Check

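nvidia-smi
# Should show GPU model, driver version, temperature, and memory usage.
# The "CUDA Version" in the header is the highest version the driver supports, not the installed toolkit.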

Then verify from Python:

python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'Device count: {torch.cuda.device_count()}'); print(f'Device name: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
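The one-liner is handy at a prompt, but a small script is easier to extend. The version below is a sketch with a hypothetical function name. Beyond the one-liner it also reports `torch.version.cuda`, which is `None` on CPU-only PyTorch builds, and it reports a missing PyTorch install instead of raising.

```python
def cuda_runtime_report() -> dict:
    """Collect the same facts as the one-liner, plus the CUDA version PyTorch was built with."""
    try:
        import torch
    except ImportError:
        # PyTorch itself is missing, so nothing about the runtime can be checked.
        return {"torch_installed": False}

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "built_with_cuda": torch.version.cuda,  # None means a CPU-only build
        "cuda_available": available,
        "device_count": torch.cuda.device_count(),
        "device_name": torch.cuda.get_device_name(0) if available else "N/A",
    }


for key, value in cuda_runtime_report().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` is `None` while `nvidia-smi` works, the likely culprit is the PyTorch build rather than the driver.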

Step 4: Container Check

docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

If this fails but nvidia-smi works on the host, the NVIDIA Container Toolkit is either not installed or not configured correctly.
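The host-versus-container comparison can be written down as a small decision function. This is a sketch only: `locate_container_fault` and `command_succeeds` are hypothetical helper names, and the container command is the one shown above.

```python
import subprocess

# The same container test as the docker run command above, as an argument list.
CONTAINER_CMD = [
    "docker", "run", "--rm", "--gpus", "all",
    "nvidia/cuda:12.0.0-base-ubuntu22.04", "nvidia-smi",
]


def command_succeeds(cmd: list[str]) -> bool:
    """Return True if the command exists and exits with status 0."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True).returncode == 0
    except FileNotFoundError:
        return False  # the binary itself is missing


def locate_container_fault(host_ok: bool, container_ok: bool) -> str:
    """Decide which layer to look at, given the host and container results."""
    if container_ok:
        return "container GPU access works"
    if host_ok:
        return "check the NVIDIA Container Toolkit install and Docker runtime configuration"
    return "fix the host driver first (Step 2)"
```

A typical call would be `locate_container_fault(command_succeeds(["nvidia-smi"]), command_succeeds(CONTAINER_CMD))`; note that the container command pulls the image on first use.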

Common Causes and Fixes

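Driver version too old: New GPUs require recent drivers, and the minimum version depends on the exact model. Driver 535 or newer is a reasonable baseline for RTX 40-series cards, but confirm your card against the supported-products list in the driver release notes.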

Secure Boot blocking driver: With Secure Boot enabled, the kernel refuses to load modules that are not signed with a trusted key, and that includes an unsigned NVIDIA kernel module. Disable Secure Boot in UEFI, or sign the module and enroll the signing key.

Docker without NVIDIA runtime: Install the NVIDIA Container Toolkit, then either pass --gpus all on every docker run command or add "default-runtime": "nvidia" to /etc/docker/daemon.json and restart the Docker daemon.
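To see what Docker is currently configured to do, you can inspect daemon.json directly. The sketch below uses hypothetical function names and assumes the usual layout, where the NVIDIA runtime is registered under a top-level `runtimes` key alongside `default-runtime`.

```python
import json


def read_daemon_json(path: str = "/etc/docker/daemon.json") -> str:
    """Return the file contents, or an empty string if Docker has no daemon.json."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return ""


def docker_runtime_status(daemon_json_text: str) -> dict:
    """Report whether an 'nvidia' runtime is registered and whether it is the default."""
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    return {
        "nvidia_runtime_registered": "nvidia" in config.get("runtimes", {}),
        "nvidia_is_default": config.get("default-runtime") == "nvidia",
    }


print(docker_runtime_status(read_daemon_json()))
```

If `nvidia_is_default` is false, containers only see the GPU when `--gpus all` is passed explicitly.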

EXERCISE

On your system, run the diagnostic sequence from hardware check through container check. Document the output of each command. When you understand what each command checks, you know exactly where to look when GPU detection fails.
