WSL OOM — give Linux enough RAM to load big models
WSL2 inherits a fraction of host RAM by default and won't let processes exceed it. Edit .wslconfig to set `memory=32GB` (or whatever you need) and restart WSL. Then verify with `free -h` inside the distro.
Diagnostic order — most likely first
WSL .wslconfig memory limit too low
Inside WSL: `free -h` shows total RAM ≪ host RAM. Default historically was 50% of host (capped 8 GB on some configs). Modern WSL defaults are higher but still bounded.
Edit `%USERPROFILE%\.wslconfig` (create it if missing) and set:

```
[wsl2]
memory=32GB
swap=8GB
```

Then run `wsl --shutdown` from PowerShell and restart your distro. Verify with `free -h`.
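After restarting, a quick sanity check inside the distro confirms the new limit took effect. This sketch reads `/proc/meminfo` directly (the same source `free -h` uses):

```shell
# Print the RAM the distro actually sees; compare against memory= in .wslconfig.
# /proc/meminfo reports KiB; convert to GiB. The visible total sits slightly
# below the configured limit because the WSL kernel reserves some memory.
mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gib=$(( mem_kib / 1024 / 1024 ))
echo "WSL distro sees ${mem_gib} GiB of RAM"
```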
WSL swap exhausted
`free -h` shows swap = 0 or very low. Model loads fine but inference hits OOM-killer mid-run because activations need swap headroom.
Set swap in .wslconfig: `swap=8GB` (or larger for 70B+ models). Restart WSL. For sustained large workloads, consider `swapfile=D:\\wsl-swap.vhdx` to put swap on a different drive.
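Rough swap sizing can be sketched with shell arithmetic. The 20% headroom figure below is an assumption covering KV cache and runtime buffers at modest context, not a measured constant:

```shell
# Estimate swap so that model + runtime headroom fits in WSL RAM + swap.
model_gb=40      # e.g. 70B Q4 GGUF on disk
wsl_ram_gb=28    # memory= allocated to WSL in .wslconfig
need_gb=$(( model_gb * 12 / 10 ))                    # model + ~20% headroom (assumed)
swap_gb=$(( need_gb > wsl_ram_gb ? need_gb - wsl_ram_gb : 0 ))
echo "Suggested swap: ${swap_gb} GB"                 # -> Suggested swap: 20 GB
```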
Multiple WSL distros sharing the budget
`wsl --list --verbose` shows multiple Running distros. They all share the .wslconfig memory limit.
Shut down distros you're not using: `wsl --terminate <DistroName>`. Note that all WSL2 distros run inside a single utility VM, so the `.wslconfig` memory limit is global; `/etc/wsl.conf` controls per-distro settings (automount, systemd, networking), but not per-distro memory limits.
Docker Desktop reserving too much
Docker Desktop is installed and running with the WSL2 backend. It runs in its own `docker-desktop` distro inside the same VM, so whatever it uses comes out of the shared `.wslconfig` memory budget.
Quit Docker Desktop (or run `wsl --terminate docker-desktop`) before loading the model, or move the inference workload outside Docker and run it directly in your distro. Note that with the WSL2 backend, Docker's memory is governed by `.wslconfig`, not by the Docker Desktop Resources slider.
Genuine system RAM insufficient
Host PC has 32 GB total RAM. You're trying to load a 70B Q4 model (~40 GB). The math doesn't work.
Smaller model. Or buy more RAM (DDR5-5600 32GB modules are ~$100 each in 2026). Or upgrade to a 24 GB GPU and skip system-RAM offload entirely.
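The sizing math above can be sketched quickly. The ~4.5 effective bits per weight for a Q4_K_M-style quant is an assumption; exact sizes vary by quant scheme:

```shell
# Back-of-envelope GGUF weight size: params * effective bits-per-weight / 8.
# Integer math scaled by 10 to carry the 4.5-bit figure.
params_b=70      # billions of parameters
bits_x10=45      # ~4.5 bits/weight (assumed, Q4_K_M-style)
weights_gb=$(( params_b * bits_x10 / 8 / 10 ))
echo "~${weights_gb} GB for weights alone"   # -> ~39 GB for weights alone
```

KV cache comes on top of this; at long contexts it can add several more GB, which is why the 32 GB host in the example above can't make it work.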
WSL OOM resolves to either system RAM or VRAM — both are hardware decisions. The guide below frames the right next tier.
Frequently asked questions
How do I configure WSL memory limit?
Create `C:\Users\<you>\.wslconfig` with `[wsl2]` section. Set `memory=32GB` (or your target) + `swap=8GB`. Run `wsl --shutdown` in PowerShell, restart the distro. Verify with `free -h` inside WSL.
WSL2 vs native Linux — performance gap for AI?
Within 1-3% on inference workloads in 2026. WSL2 GPU passthrough is mature; the kernel + filesystem overhead is negligible. The wins of WSL (Windows file sharing, easier OS workflow) usually outweigh the tiny perf delta. For training at scale, native Linux still wins.
Can I run 70B in WSL with 32 GB system RAM?
Tight. 70B Q4 GGUF is ~40 GB. With 32 GB system RAM (allocate 28 GB to WSL via .wslconfig), you'll need substantial swap or full GPU offload. Practical answer: more RAM, or run inference on the host directly (skip WSL for memory-tight workloads).
Related troubleshooting
WSL2 doesn't pass the GPU through unless the host driver is right and the kernel is current. Here's the install order that actually works in 2026, and how to confirm passthrough is live before you waste an afternoon.
Why CUDA OOM happens during local LLM inference and image gen, how to confirm the real cause, and the four real fixes (smaller quant, shorter context, gradient checkpointing, or more VRAM).
Mid-inference crashes (segfault, illegal memory access, kernel panic) usually mean VRAM ECC, thermal throttling, PSU instability, or a bad model file. Here's the diagnostic order.
When the fix is hardware
A surprising fraction of troubleshooting tickets resolve to: this card doesn't have enough VRAM for what you're asking it to do. If you're hitting OOM after every reasonable fix, or your GPU genuinely can't fit the model you need, it's upgrade time: