What this does

Monitoring memory usage during inference helps diagnose out-of-memory errors, identify memory leaks, and tune offloading settings for optimal performance.

Steps

Monitor GPU memory in real-time.
```
nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv -l 1
```
Expected: A new CSV row every second showing used VRAM, total VRAM, and GPU utilization.

Log memory to a file during a benchmark run.

```
# Start logging in the background (one sample every 500 ms, plain numbers in MiB)
nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits -lms 500 > gpu_mem_log.csv &
LOG_PID=$!
# Run inference
./llama-cli -m model.gguf -p "Long prompt here" -n 512
# Stop logging
kill "$LOG_PID"
```
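
If each sample should also carry an elapsed time, the same query can be driven from Python. The following is a minimal sketch, not part of the workflow above: `parse_sample`, `sample_gpu_memory`, and `log_while_running` are hypothetical helper names, and it assumes `nvidia-smi` is on the PATH and that only the first GPU matters.

```python
import subprocess
import time

def parse_sample(line):
    """Convert one memory.used field ('1234 MiB' or '1234') to an integer in MiB."""
    return int(line.strip().replace("MiB", "").strip())

def sample_gpu_memory():
    """Ask nvidia-smi for the current memory.used value of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])

def log_while_running(cmd, path, interval_s=0.5):
    """Run cmd and write 'elapsed_seconds,memory_mib' rows until it exits."""
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    with open(path, "w") as log:
        while proc.poll() is None:
            log.write(f"{time.monotonic() - start:.2f},{sample_gpu_memory()}\n")
            time.sleep(interval_s)
```

A call such as `log_while_running(["./llama-cli", "-m", "model.gguf", "-n", "512"], "gpu_mem_timed.csv")` would write a two-column file. Use a separate file name so it is not confused with the single-column `gpu_mem_log.csv`.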

Monitor CPU memory on Windows.

```
# Memory used by the ollama process (values are in bytes)
Get-Process -Name ollama | Select-Object WorkingSet64, PrivateMemorySize64
# Or watch total system memory
while ($true) { Get-Counter "\Memory\Available MBytes"; Start-Sleep 1 }
```

Plot memory usage over time.

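```
import pandas as pd
import matplotlib.pyplot as plt

# One memory.used sample per row; values may carry a " MiB" suffix
df = pd.read_csv("gpu_mem_log.csv", header=None, names=["memory_mib"], dtype=str)
df["memory_mib"] = pd.to_numeric(df["memory_mib"].str.replace("MiB", "", regex=False).str.strip())
df.plot()
plt.xlabel("Sample number")
plt.ylabel("GPU Memory (MiB)")
plt.savefig("memory_profile.png")
```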
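
A plot is easier to act on when paired with a few numbers. The sketch below uses only the standard library to reduce the logged samples to a baseline, a peak, and a mean. The helper names `load_samples` and `summarize` and the demo values are illustrative assumptions, not measurements.

```python
from statistics import mean

def load_samples(lines):
    """Parse memory.used rows ('1234 MiB' or '1234') into integers in MiB."""
    samples = []
    for line in lines:
        text = line.strip().replace("MiB", "").strip()
        if text:  # skip blank lines
            samples.append(int(text))
    return samples

def summarize(samples):
    """Return baseline (first sample), peak, and mean memory use in MiB."""
    return {
        "baseline_mib": samples[0],
        "peak_mib": max(samples),
        "mean_mib": mean(samples),
    }

# Made-up rows standing in for the contents of gpu_mem_log.csv
demo = load_samples(["500 MiB", "2100 MiB", "4000 MiB", "4000 MiB", ""])
print(summarize(demo))
```

For a real run, pass `open("gpu_mem_log.csv")` to `load_samples` instead of the demo list. The gap between peak and total VRAM is the headroom available for offloading more layers.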

Verification

```
# Check that the log contains memory samples (PowerShell; on Linux use: head -n 5 gpu_mem_log.csv)
Get-Content gpu_mem_log.csv | Select-Object -First 5
# Expected: rising memory values as the model loads, then a plateau during generation
```
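
The "rise, then plateau" expectation can also be checked programmatically. This is a rough sketch under stated assumptions: the function name `rises_then_plateaus`, the 64 MiB tolerance, and the five-sample tail are arbitrary illustrative choices and should be tuned to the model and sampling interval in use.

```python
def rises_then_plateaus(samples, tolerance_mib=64, tail=5):
    """True if memory grew after the first sample and the last `tail` samples are flat."""
    if len(samples) < tail + 1:
        return False  # not enough data to judge
    last = samples[-tail:]
    grew = max(samples) > samples[0]
    flat = max(last) - min(last) <= tolerance_mib
    return grew and flat

# Hypothetical traces in MiB: one that settles, one that keeps growing
settled = [500, 1500, 3000, 4000, 4010, 4005, 4000, 4008]
growing = [500, 1000, 2000, 3000, 4000, 5000, 6000]
print(rises_then_plateaus(settled), rises_then_plateaus(growing))
```

A trace that keeps growing through generation is worth a closer look, since it may point to a leak or to a context that is still filling.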

Common failures

nvidia-smi shows zero GPU activity: The model may be running entirely on CPU. Check the --n-gpu-layers setting.
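Sampling interval poorly chosen: A long interval can miss short allocation spikes, while a very short one such as -lms 100 (100 ms) adds polling overhead and produces large logs. 500 ms is a reasonable starting point; shorten it only when chasing brief spikes.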
Permission denied: On Linux, nvidia-smi may need sudo for certain metrics. On Windows, run PowerShell as Administrator.
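
How the sampling interval affects what the log captures can be illustrated with a synthetic trace. The numbers below are invented for the illustration (a 200 ms spike on a steady 4000 MiB baseline) and `sample_every` is a hypothetical helper. The point is only that a brief spike can fall between coarse samples.

```python
def sample_every(trace, step):
    """Keep every `step`-th value of a trace recorded at a fixed base interval."""
    return trace[::step]

# Synthetic trace at 100 ms resolution: 4000 MiB with a 200 ms spike to 7000 MiB
trace = [4000] * 3 + [7000] * 2 + [4000] * 5

peak_100ms = max(sample_every(trace, 1))  # one sample every 100 ms
peak_500ms = max(sample_every(trace, 5))  # one sample every 500 ms
print(peak_100ms, peak_500ms)  # prints: 7000 4000
```

Here the 500 ms log never sees the spike, so an out-of-memory error could occur even though the logged peak looks safe.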

Operator checkpoint

Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.
