03. WSL2 Memory and Performance Tuning
By default, WSL2 can claim up to 50% of your system RAM (some older WSL builds also capped this at 8 GB, whichever is lower). On a machine with 32 GB this is fine. On a machine with 16 GB running Ollama, Docker, and a browser simultaneously, WSL2 memory growth can hit the limit, trigger the Linux OOM killer, and kill your inference session without warning.
Check current memory use inside WSL2:
free -h
#        total   used    free    shared  buff/cache  available
# Mem:   7.7Gi   3.2Gi   4.4Gi   0.0Ki   1.0Gi       4.4Gi
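The same numbers can be read programmatically from /proc/meminfo, which is where free gets them. The following is a minimal sketch, not a definitive monitor: the function names and the 10% warning threshold are assumptions chosen for illustration.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each /proc/meminfo field name to its numeric value (KiB for sized fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_fraction(fields: dict[str, int]) -> float:
    """Fraction of total RAM that the kernel estimates is still available."""
    return fields["MemAvailable"] / fields["MemTotal"]


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux, including WSL2
        info = parse_meminfo(meminfo.read_text())
        frac = available_fraction(info)
        print(f"MemTotal: {info['MemTotal'] / 1024**2:.1f} GiB, available: {frac:.0%}")
        if frac < 0.10:  # hypothetical threshold, tune for your workload
            print("Warning: little memory left; the OOM killer may act soon.")
```

Run it inside WSL2 before loading a model; MemAvailable is usually a better guide than the free column because it accounts for reclaimable cache.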
The .wslconfig file at C:\Users\YOUR_USERNAME\.wslconfig controls WSL2's memory and CPU allocation. Create it if it does not exist:
[wsl2]
memory=12GB
processors=8
swap=8GB
localhostForwarding=true
This limits WSL2 to 12 GB of RAM and gives it 8 GB of swap, stored in a separate virtual disk file on the Windows host. By default the swap file is %USERPROFILE%\AppData\Local\Temp\swap.vhdx (the swapFile key can move it), and it grows dynamically. If the swap file fills up and WSL2 cannot expand it (disk full or quota reached), memory allocations inside WSL2 can fail without a clear error message.
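If you prefer to generate the file instead of typing it, the settings above can be rendered with Python's configparser. This is a sketch under stated assumptions: render_wslconfig and fits_host are hypothetical helper names, and the 4 GB reserve for Windows is an illustrative figure, not a documented requirement.

```python
import configparser
import io


def render_wslconfig(memory_gb: int, processors: int, swap_gb: int) -> str:
    """Render a [wsl2] section using the keys shown in this chapter."""
    config = configparser.ConfigParser()
    config.optionxform = str  # preserve key case such as localhostForwarding
    config["wsl2"] = {
        "memory": f"{memory_gb}GB",
        "processors": str(processors),
        "swap": f"{swap_gb}GB",
        "localhostForwarding": "true",
    }
    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


def fits_host(memory_gb: int, host_ram_gb: int, reserve_gb: int = 4) -> bool:
    """Check that the WSL2 limit leaves reserve_gb (an assumed figure) for Windows."""
    return memory_gb <= host_ram_gb - reserve_gb


if __name__ == "__main__":
    if fits_host(12, host_ram_gb=16):
        print(render_wslconfig(memory_gb=12, processors=8, swap_gb=8))
```

The script only prints the text; copy it into C:\Users\YOUR_USERNAME\.wslconfig yourself so that an existing file is never overwritten by accident.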
Apply changes by shutting down WSL2:
wsl --shutdown
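Wait a few seconds for the VM to stop (wsl --list --running should report no running distributions), then reopen Ubuntu. The new limits apply as soon as the VM restarts.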
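To confirm that a new limit was picked up, compare MemTotal inside WSL2 with the value in .wslconfig. The total reported by Linux is typically a little below the configured figure (the sample output earlier shows 7.7Gi), so the sketch below allows a tolerance. The helper names, the 10% tolerance, and the treatment of 1 GB as 1024 * 1024 KiB are assumptions for illustration.

```python
def read_mem_total_kib(path: str = "/proc/meminfo") -> int:
    """Return the MemTotal value from /proc/meminfo, in KiB."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise ValueError(f"MemTotal not found in {path}")


def limit_applied(mem_total_kib: int, configured_gb: float, tolerance: float = 0.10) -> bool:
    """True if MemTotal is at most the configured limit and within tolerance below it."""
    configured_kib = configured_gb * 1024 * 1024  # assumption: 1 GB counted as 1 GiB
    return configured_kib * (1 - tolerance) <= mem_total_kib <= configured_kib


if __name__ == "__main__":
    total = read_mem_total_kib()
    # 12 matches memory=12GB from the example config; change it to your own value.
    print("limit applied" if limit_applied(total, 12) else "limit NOT applied")
```

If the check fails right after editing the file, the VM was most likely still running when Ubuntu was reopened; run wsl --shutdown again and wait before retrying.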
Networking setting: localhostForwarding=true (the default) lets you reach WSL2 services (Ollama on port 11434, for example) from the Windows host at http://localhost:11434. If it is set to false, you must use the WSL2 internal IP address, which can change after every wsl --shutdown.
To find the current WSL2 IP:
hostname -I | awk '{print $1}'
Store this in an environment variable in PowerShell (it lasts only for the current session, so rerun the command after each WSL2 restart):
$env:WSL_IP = (wsl.exe hostname -I | ForEach-Object { $_.Split(' ')[0] })
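Once you have the address, it is worth checking that the service port actually answers before pointing a client at it. The sketch below uses only the standard library; first_address and port_open are hypothetical helper names, and port 11434 is the Ollama example used in this chapter.

```python
import socket


def first_address(hostname_output: str) -> str:
    """Return the first whitespace-separated address from `hostname -I` output."""
    addresses = hostname_output.split()
    if not addresses:
        raise ValueError("no address found in output")
    return addresses[0]


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # localhost works when localhostForwarding=true; otherwise pass the WSL2 IP.
    for host in ("localhost",):
        state = "reachable" if port_open(host, 11434) else "not reachable"
        print(f"{host}:11434 is {state}")
```

A successful connection only shows that something is listening on the port; it does not confirm that the listener is Ollama or that a model is loaded.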
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
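Read your current .wslconfig, double the memory allocation (if doubling would exceed your installed RAM, raise it by a smaller amount instead), run wsl --shutdown, confirm the new limit in free -h inside Ubuntu, then restore the original value and run wsl --shutdown again.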