15. Windows AI Tools Ecosystem

Chapter 15 of 15 · 20 min

Windows has a growing list of native AI tools beyond Ollama. Here is how they fit together and where they overlap.

text-generation-webui (oobabooga): Python-based Gradio UI for running models. Install via Git:

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py

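These commands assume Python 3.11+ inside WSL2. GPU acceleration depends on the loader: bitsandbytes covers quantized Transformers models, while GGUF models need a llama.cpp build with CUDA (cuBLAS) support. GGUF files load through the project's llama.cpp loader. This is the most flexible option, but it requires more manual configuration than Ollama.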
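Before cloning, it can help to confirm that the interpreter meets the version floor and that the shell really is WSL2. The following is a minimal sketch and not part of text-generation-webui: the helper names `meets_minimum`, `looks_like_wsl`, and `read_kernel_text` are hypothetical, and the WSL check assumes that the kernel string in /proc/version contains "microsoft" under WSL.

```python
import sys
from pathlib import Path

MIN_PYTHON = (3, 11)  # version floor stated in this chapter


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) part of `version` is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def looks_like_wsl(kernel_text):
    """Heuristic (assumption): WSL kernels report 'microsoft' in /proc/version."""
    return "microsoft" in kernel_text.lower()


def read_kernel_text(path="/proc/version"):
    """Read the kernel version string, or return '' where the file is absent."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


if __name__ == "__main__":
    print("Python OK:", meets_minimum())
    print("Inside WSL:", looks_like_wsl(read_kernel_text()))
```

If the first line prints False, install a newer Python before running pip install, since dependency resolution may otherwise fail partway through.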

Jan: A native Windows Electron app (also available on macOS and Linux) that provides a local chat interface similar to ChatGPT. Download it from jan.ai. It bundles a server backend and a chat UI in one executable. Models download from Hugging Face directly into C:\Users\USER\AppData\Roaming\jan\models, where USER is your Windows account name (the exact folder can differ between Jan versions). Once its local API server is started, Jan exposes an OpenAI-compatible API on port 1337.
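Because the API on port 1337 follows the OpenAI chat-completions shape, a client needs only the Python standard library. The sketch below is illustrative rather than Jan's own tooling: the function names are hypothetical, the model id "my-local-model" is a placeholder, and the response parsing assumes the usual OpenAI layout of choices[0].message.content.

```python
import json
import urllib.request

JAN_BASE_URL = "http://localhost:1337/v1"  # port taken from the text above


def build_chat_request(model, prompt, base_url=JAN_BASE_URL):
    """Build (without sending) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request, timeout=60):
    """Send the request and return the first reply's text.

    Assumes the response follows the OpenAI chat-completions shape.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires Jan's API server to be running):
#   req = build_chat_request("my-local-model", "Say hello in one sentence.")
#   print(send_chat(req))
```

Building the request separately from sending it makes it easy to inspect the URL and JSON body when the server rejects a call.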

LocalAI: A drop-in OpenAI API replacement that runs locally. Use it when you want to run code written for the OpenAI API against a local model without changing the application code itself; only the base URL it points to changes.

docker run -d \
  -p 8080:8080 \
  -v /mnt/d/models:/models \
  --name localai \
  quay.io/go-skynet/local-ai:latest \
  --models-path /models

Create a model config at /mnt/d/models/llama3.yaml:

name: llama3.2:1b
backend: llama
parameters:
  model: llama3.2-1b.Q4_K_M.gguf
  temperature: 0.7
  top_p: 0.9
context_size: 2048
f16: true
threads: 8
gpu_layers: 35

Test with:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:1b",
    "messages": [{"role":"user","content":"What year was Python 3 released?"}]
  }'
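The "drop-in" idea amounts to keeping the request code fixed and changing only the base URL. The sketch below illustrates that under stated assumptions: the environment variable name LLM_BASE_URL and both helper functions are hypothetical, and port 8080 comes from the docker command earlier in this section.

```python
import os

OPENAI_DEFAULT = "https://api.openai.com/v1"
LOCALAI_DEFAULT = "http://localhost:8080/v1"  # port from the docker command above


def resolve_base_url(env=None):
    """Pick the API base URL from a (hypothetical) LLM_BASE_URL variable.

    Falls back to the hosted OpenAI endpoint when the variable is unset.
    """
    env = os.environ if env is None else env
    return env.get("LLM_BASE_URL", OPENAI_DEFAULT).rstrip("/")


def chat_endpoint(base_url):
    """Return the chat-completions URL for a given base URL."""
    return f"{base_url}/chat/completions"


if __name__ == "__main__":
    # Point the same application at LocalAI by exporting, for example:
    #   LLM_BASE_URL=http://localhost:8080/v1
    print(chat_endpoint(resolve_base_url()))
```

Reading the base URL from configuration keeps the application code identical in both environments, which is the property the curl test is meant to confirm.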

Tool selection summary:

| Tool | Interface | Model Format | Best For |
| --- | --- | --- | --- |
| Ollama | CLI + REST API | GGUF (stored as blobs) | Fast setup, CLI-first |
| LM Studio | GUI | GGUF | Non-technical users |
| Open WebUI | Web | Served by Ollama | Multi-user, RAG |
| Jan | GUI + API | GGUF | Chat interface, API backend |
| text-generation-webui | Web | GGUF, safetensors | Maximum configuration |
| LocalAI | API | GGUF | API compatibility testing |

EXERCISE

Install Jan from jan.ai, download a model, start the API server, and confirm it responds to a curl request to http://localhost:1337/v1/chat/completions. Compare the model download location with Ollama's download location.
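For the comparison step, a small script can report how many files each tool has stored and how much disk space they use. This is a rough sketch under assumptions: `summarize_models` is a hypothetical helper, the Jan path is the one given in this chapter, and the Ollama path assumes its default of .ollama\models under the user profile. Run it with a Windows Python interpreter so that the home directory resolves to C:\Users\USER.

```python
from pathlib import Path


def summarize_models(root):
    """Return (file_count, total_bytes) for all files under `root`."""
    root = Path(root)
    if not root.is_dir():
        return (0, 0)
    sizes = [p.stat().st_size for p in root.rglob("*") if p.is_file()]
    return (len(sizes), sum(sizes))


if __name__ == "__main__":
    home = Path.home()
    candidates = {
        # Path as given in this chapter; may differ between Jan versions.
        "Jan": home / "AppData" / "Roaming" / "jan" / "models",
        # Assumed default Ollama location.
        "Ollama": home / ".ollama" / "models",
    }
    for name, path in candidates.items():
        count, total = summarize_models(path)
        print(f"{name}: {path} -> {count} files, {total / 1e9:.2f} GB")
```

A result of 0 files for either tool most likely means the directory is somewhere else on your machine, which is itself a useful finding for the exercise.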