RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
  1. >
  2. Home
  3. /Troubleshooting
  4. /WSL2 OOM-killer killing inference / 'Killed' message
fatal✓Editorial·Reviewed May 2026

WSL OOM — give Linux enough RAM to load big models

WSL2 inherits a fraction of host RAM by default and won't let processes exceed it. Edit .wslconfig to set `memory=32GB` (or whatever you need) and restart WSL. Then verify with `free -h` inside the distro.

WSL2Ubuntu on WSLDocker Desktop on WindowsLinux memory subsystem
By Fredoline Eruo · Last verified 2026-05-08

Diagnostic order — most likely first

#1

WSL .wslconfig memory limit too low

Diagnose

Inside WSL: `free -h` shows total RAM ≪ host RAM. Default historically was 50% of host (capped 8 GB on some configs). Modern WSL defaults are higher but still bounded.

Fix

Edit `%USERPROFILE%\.wslconfig` (create if missing). Set: ``` [wsl2] memory=32GB swap=8GB ``` Then `wsl --shutdown` from PowerShell, restart your distro. Verify with `free -h`.

#2

WSL swap exhausted

Diagnose

`free -h` shows swap = 0 or very low. Model loads fine but inference hits OOM-killer mid-run because activations need swap headroom.

Fix

Set swap in .wslconfig: `swap=8GB` (or larger for 70B+ models). Restart WSL. For sustained large workloads, consider `swapfile=D:\\wsl-swap.vhdx` to put swap on a different drive.

#3

Multiple WSL distros sharing the budget

Diagnose

`wsl --list --verbose` shows multiple Running distros. They all share the .wslconfig memory limit.

Fix

Shut down distros you're not using: `wsl --terminate <DistroName>`. Or set per-distro limits via `/etc/wsl.conf` inside each distro (newer WSL versions support per-distro overrides).

#4

Docker Desktop reserving too much

Diagnose

Docker Desktop is installed and running. Its WSL backend reserves a chunk of the WSL memory budget that other distros can't access.

Fix

Docker Desktop → Settings → Resources → WSL Integration. Lower Docker's memory allocation. Or move the inference workload outside Docker (run directly in your distro).

#5

Genuine system RAM insufficient

Diagnose

Host PC has 32 GB total RAM. You're trying to load a 70B Q4 model (~40 GB). The math doesn't work.

Fix

Smaller model. Or buy more RAM (DDR5-5600 32GB modules are ~$100 each in 2026). Or upgrade to a 24 GB GPU and skip system-RAM offload entirely.

Best AI PC build under $2,000 →
If this keeps happening — the next decision is hardware

WSL OOM resolves to either system RAM or VRAM — both are hardware decisions. The guide below frames the right next tier.

  • best budget GPU for local AI
  • 16 GB vs 24 GB VRAM for local AI

Frequently asked questions

How do I configure WSL memory limit?

Create `C:\Users\<you>\.wslconfig` with `[wsl2]` section. Set `memory=32GB` (or your target) + `swap=8GB`. Run `wsl --shutdown` in PowerShell, restart the distro. Verify with `free -h` inside WSL.

WSL2 vs native Linux — performance gap for AI?

Within 1-3% on inference workloads in 2026. WSL2 GPU passthrough is mature; the kernel + filesystem overhead is negligible. The wins of WSL (Windows file sharing, easier OS workflow) usually outweigh the tiny perf delta. For training at scale, native Linux still wins.

Can I run 70B in WSL with 32 GB system RAM?

Tight. 70B Q4 GGUF is ~40 GB. With 32 GB system RAM (allocate 28 GB to WSL via .wslconfig), you'll need substantial swap or full GPU offload. Practical answer: more RAM, or run inference on the host directly (skip WSL for memory-tight workloads).

Related troubleshooting

WSL2 cannot see GPU / nvidia-smi fails inside WSL

WSL2 doesn't pass the GPU through unless the host driver is right and the kernel is current. Here's the install order that actually works in 2026, and how to confirm passthrough is live before you waste an afternoon.

CUDA out of memory

Why CUDA OOM happens during local LLM inference and image gen, how to confirm the real cause, and the four real fixes (smaller quant, shorter context, gradient checkpointing, or more VRAM).

Model keeps crashing / segfault during inference

Mid-inference crashes (segfault, illegal memory access, kernel panic) usually mean VRAM ECC, thermal throttling, PSU instability, or a bad model file. Here's the diagnostic order.

When the fix is hardware

A surprising fraction of troubleshooting tickets resolve to: this card doesn't have enough VRAM for what you're asking it to do. If you're hitting OOM after every reasonable fix, or your GPU genuinely can't fit the model you need, it's upgrade time:

  • Best GPU for local AI
  • Best laptop for local AI
  • Best Mac for local AI

Where next?

All troubleshooting guides
OrBest GPU for local AIWill it run on my hardware?