Windows AI Tools Ecosystem — Local AI on Windows (Chapter 15)

Windows has a growing list of native AI tools beyond Ollama. Here is how they fit together and where they overlap.

text-generation-webui (oobabooga): Python-based Gradio UI for running models. Install via Git:

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py

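Requires Python 3.11 or later. The project runs natively on Windows, so WSL2 is optional; the repository also ships a start_windows.bat script that creates its own Python environment if you prefer not to manage one by hand. GPU acceleration needs a CUDA-enabled build of the loader you use; bitsandbytes is only needed for 8-bit or 4-bit loading of Transformers models. GGUF models load through the llama.cpp loader. This is the most flexible option but requires more manual configuration than Ollama.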

Jan: A native Windows Electron app (also available on Mac and Linux) that provides a local chat interface similar to ChatGPT. Download it from jan.ai. It bundles a server backend and a chat UI in one executable. Models download from Hugging Face directly into C:\Users\USER\AppData\Roaming\jan\models, where USER is your Windows account name (the exact folder can differ between Jan versions). When its local API server is running, Jan exposes an OpenAI-compatible API on port 1337.
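Because Jan's server uses the OpenAI chat-completions format, any HTTP client can talk to it. The following is a minimal sketch using only the Python standard library. It assumes the server listens on port 1337 as described above and that the endpoint sits under a /v1 prefix; the function names are made up for this example, and the model id must be one that your Jan install actually lists.

```python
import json
import urllib.request

# Port 1337 comes from the text; the /v1 prefix is an assumption.
JAN_BASE_URL = "http://localhost:1337/v1"


def build_chat_request(base_url, model, prompt):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return payload["choices"][0]["message"]["content"]
```

With the server running, send_chat(build_chat_request(JAN_BASE_URL, "<model-id>", "Hello")) should return the reply as a string. Keeping request construction separate from sending makes it easy to inspect the payload before anything goes over the wire.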

LocalAI: A drop-in OpenAI API replacement that runs locally. Use it when you want to run code written for the OpenAI API against a local model without changing anything in the application except the base URL.

# Run from a WSL2 shell; /mnt/d/models is D:\models on the Windows side.
docker run -d \
  -p 8080:8080 \
  -v /mnt/d/models:/models \
  -e MODELS_PATH=/models \
  --name localai \
  quay.io/go-skynet/local-ai:latest

Create a model config at /mnt/d/models/llama3.yaml, in the same directory as the GGUF file it references:

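# name is the model id that clients send in the "model" field
name: llama3.2:1b
backend: llama-cpp
parameters:
  model: llama3.2-1b.Q4_K_M.gguf
  temperature: 0.7
  top_p: 0.9
context_size: 2048
f16: true
threads: 8
# gpu_layers only has an effect with a GPU-enabled image and a container started with --gpus all
gpu_layers: 35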

Test with:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:1b",
    "messages": [{"role":"user","content":"What year was Python 3 released?"}]
  }'
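The point of LocalAI is that only the base URL changes between the hosted API and the local server. One way to keep that switch out of application code is to read the base URL from the environment, as in this minimal sketch. The variable name LLM_BASE_URL and both function names are hypothetical, chosen for this example; the local URL assumes the 8080 port mapping from the docker command above.

```python
import os

OPENAI_DEFAULT = "https://api.openai.com/v1"
LOCALAI_URL = "http://localhost:8080/v1"  # assumes the 8080:8080 port mapping


def resolve_base_url(env=None):
    """Return the API base URL, preferring an override from the environment.

    LLM_BASE_URL is a hypothetical variable name used only in this sketch.
    """
    env = os.environ if env is None else env
    # Strip a trailing slash so joining paths never produces "//".
    return env.get("LLM_BASE_URL", OPENAI_DEFAULT).rstrip("/")


def chat_url(env=None):
    """Full chat-completions URL for whichever backend is configured."""
    return resolve_base_url(env) + "/chat/completions"
```

Setting LLM_BASE_URL to http://localhost:8080/v1 before launching the application then routes the same requests to LocalAI, and unsetting it restores the hosted endpoint. The official openai Python package accepts a base_url argument that serves the same purpose.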

Tool selection summary:

Tool	Interface	Model Format	Best For
Ollama	CLI + REST API	GGUF	Fast setup, CLI-first
LM Studio	GUI	GGUF	Non-technical users
Open WebUI	Web	None (front end for Ollama or OpenAI-compatible APIs)	Multi-user, RAG
Jan	GUI + API	GGUF	Chat interface, API backend
text-generation-webui	Web	GGUF, safetensors	Maximum configuration
LocalAI	API	GGUF	API compatibility testing
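
Several of these tools can run side by side, so it helps to know which ones are currently listening. The sketch below checks the default ports with a plain TCP connection attempt. The Jan and LocalAI ports come from this chapter; 11434 is Ollama's usual default and is an assumption about your setup. An open port only shows that something is listening there, not which program it is.

```python
import socket

# Jan and LocalAI ports are taken from this chapter; Ollama's 11434 is its
# usual default and is an assumption about your setup.
DEFAULT_PORTS = {"Ollama": 11434, "Jan": 1337, "LocalAI": 8080}


def port_open(host, port, timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def running_tools(ports=DEFAULT_PORTS, host="127.0.0.1"):
    """Map each tool name to whether its port is accepting connections."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling running_tools() returns a dictionary keyed by tool name with True or False values, which is a quick way to spot port conflicts before starting another server.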