Introduction to Local AI on Windows — Local AI on Windows (Chapter 1)

Windows is not Linux. That sounds obvious, but it creates specific problems when you try to run local AI tooling that was originally built for Unix environments. Most AI runtimes (Ollama, llama.cpp, text-generation-webui) are developed and tested first on Unix-like systems, and their scripts, documentation, and default paths tend to assume one. On Windows you have three paths: WSL2 (Windows Subsystem for Linux 2), native Windows builds, or Docker containers. Each has tradeoffs.

WSL2 gives you a real Linux kernel running in a lightweight VM. GPU access works through the NVIDIA driver installed on the Windows side, which exposes CUDA inside WSL2, so CUDA workloads run at near-native speed. The main inconvenience is that you are constantly switching between two file systems: /home/user/ inside WSL2 is not C:\Users\user\ on the Windows side. Windows drives appear inside WSL2 under /mnt/ (C:\ becomes /mnt/c/ by default), and the two sides differ in line endings, path separators, and permission models.
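
The path mismatch is easier to see in code. The sketch below is a minimal illustration that assumes the default WSL2 automount layout (drives under /mnt/<letter>); the function names are hypothetical and not part of any library. Inside WSL2 itself, the built-in wslpath utility does this conversion and also respects a customized mount root, so treat this as a teaching aid rather than a replacement.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path: str) -> str:
    """Map an absolute drive-letter Windows path to its default WSL2 mount."""
    win = PureWindowsPath(path)
    drive = win.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":" or not win.root:
        raise ValueError(f"expected an absolute drive-letter path, got {path!r}")
    # parts[0] is the anchor ("C:\\"); the remaining parts are plain names.
    return str(PurePosixPath("/mnt", drive[0].lower(), *win.parts[1:]))


def wsl_to_windows(path: str) -> str:
    """Map a /mnt/<letter>/... path back to a Windows drive-letter path."""
    parts = PurePosixPath(path).parts  # ("/", "mnt", "c", "Users", ...)
    if len(parts) < 3 or parts[:2] != ("/", "mnt") or len(parts[2]) != 1:
        raise ValueError(f"expected a path under /mnt/<drive>, got {path!r}")
    return str(PureWindowsPath(f"{parts[2].upper()}:\\", *parts[3:]))


if __name__ == "__main__":
    print(windows_to_wsl(r"C:\Users\user\models\llama.gguf"))
    print(wsl_to_windows("/mnt/c/Users/user"))
```

Note that a path such as /home/user/ has no drive-letter equivalent at all, which is why the second function rejects anything outside /mnt/.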

Native Windows builds exist for some tools: Ollama ships a Windows installer, and LM Studio has a Windows binary. These are easier to manage from File Explorer but harder to integrate with Docker, and they can lag behind the Linux builds in optimizations. Docker Desktop on Windows uses a WSL2 backend by default, which means containers see a Linux environment, not the Windows one.
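
Because the same script can end up running in any of the three environments, it helps to check which one you are in before assuming paths or GPU access. The following is a heuristic sketch, not an official detection method: it assumes that WSL2 kernels report "microsoft" in their release string and that Docker creates a /.dockerenv marker file inside containers. The function names and return labels are hypothetical.

```python
import os
import platform


def classify_environment(system: str, kernel_release: str, has_dockerenv: bool) -> str:
    """Classify the runtime from three cheap-to-collect facts (heuristic)."""
    if system == "Windows":
        return "native-windows"
    if system == "Linux":
        # Check the container marker first: a Docker Desktop container on the
        # WSL2 backend shares the WSL2 kernel, so its release string would
        # also contain "microsoft".
        if has_dockerenv:
            return "docker-container"
        if "microsoft" in kernel_release.lower():
            return "wsl"
        return "linux"
    return "other"


def detect_environment() -> str:
    """Collect the facts from the current process and classify them."""
    return classify_environment(
        platform.system(),
        platform.release(),
        os.path.exists("/.dockerenv"),
    )


if __name__ == "__main__":
    print(detect_environment())
```

Keeping the classification separate from the fact-gathering makes the logic easy to test on any machine, including one that is not running Windows.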

The Windows-specific failure modes that bite most often are: antivirus scans blocking executable files in %LOCALAPPDATA%, Windows Defender interfering with model file downloads, Hyper-V and other virtualization settings conflicting with WSL2 (which itself depends on the Windows hypervisor), and memory exhaustion when WSL2 and Docker compete for the same 16 GB of RAM. We address all of these in subsequent chapters.
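
To make the memory problem concrete, here is a rough budgeting sketch. It assumes you cap the WSL2 VM with the memory setting in the [wsl2] section of %UserProfile%\.wslconfig, and that Docker Desktop's WSL2 backend runs inside that same VM so the cap covers both. The 6 GB Windows reserve and 2 GB runtime overhead are illustrative guesses, not measured values, and the function names are hypothetical.

```python
def wsl_memory_cap_gb(host_ram_gb: int, windows_reserve_gb: int = 6) -> int:
    """Leave a fixed reserve for Windows and give the rest to the WSL2 VM."""
    cap = host_ram_gb - windows_reserve_gb
    if cap < 1:
        raise ValueError("host has too little RAM for the requested reserve")
    return cap


def render_wslconfig(memory_gb: int) -> str:
    """Render a minimal .wslconfig body that caps WSL2 memory."""
    return f"[wsl2]\nmemory={memory_gb}GB\n"


def model_fits(model_gb: float, cap_gb: int, overhead_gb: float = 2.0) -> bool:
    """Rough check: model weights plus assumed runtime overhead within the cap."""
    return model_gb + overhead_gb <= cap_gb


if __name__ == "__main__":
    cap = wsl_memory_cap_gb(16)
    print(render_wslconfig(cap))
    print("4.5 GB model fits:", model_fits(4.5, cap))
```

Changes to .wslconfig take effect only after the VM restarts, for example after running wsl --shutdown from a Windows terminal.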

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
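
One possible way to capture that record is a small script run next to the exercise. This is a sketch under assumptions: the field names, the example package, and the example path are placeholders chosen for illustration, and the observed output is something you paste in by hand.

```python
import json
import platform
import sys
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_record(packages, data_path, observed_output, constraints=None) -> dict:
    """Bundle the facts the checkpoint asks for into one dictionary."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
        "data_path": str(data_path),
        "observed_output": observed_output,
        # Free-form notes such as model size, backend, or available memory.
        "constraints": constraints or {},
    }


if __name__ == "__main__":
    record = build_record(
        packages=["numpy"],                 # placeholder package name
        data_path=r"C:\models\example",     # placeholder path
        observed_output="<paste output here>",
        constraints={"backend": "CPU"},
    )
    print(json.dumps(record, indent=2))
```

Saving the printed JSON alongside the exercise gives you something to compare against when a later run behaves differently.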