15. Windows AI Tools Ecosystem
Windows has a growing list of native AI tools beyond Ollama. Here is how they fit together and where they overlap.
text-generation-webui (oobabooga): Python-based Gradio UI for running models. Install via Git:
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py
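The manual install above requires Python 3.11+. It runs natively on Windows, so WSL2 is optional rather than required. GPU acceleration depends on a CUDA-enabled build of whichever backend loads the model, for example llama.cpp built with cuBLAS, or bitsandbytes for quantized Transformers models. GGUF models load through the llama.cpp loader, not a separate branch. This is the most flexible option but requires more manual configuration than Ollama.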
Jan: A native Windows Electron app (also available on macOS and Linux) that provides a local chat interface similar to ChatGPT. Download it from jan.ai. It bundles a server backend and a chat UI in one executable. Models download directly from Hugging Face into C:\Users\USER\AppData\Roaming\jan\models, where USER is your Windows account name. Once you start its local API server, Jan exposes an OpenAI-compatible API on port 1337.
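A quick way to confirm that Jan's API server is reachable is to ask it which models it knows about. The sketch below is a minimal example using only the Python standard library. It assumes the server follows the usual OpenAI convention of listing models at `/v1/models` with a `{"data": [{"id": ...}]}` body; the helper names (`models_url`, `extract_model_ids`, `list_models`) are made up for illustration and are not part of Jan.

```python
import json
import urllib.error
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible server."""
    return base_url.rstrip("/") + "/v1/models"


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model ids out of an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def list_models(base_url: str = "http://localhost:1337", timeout: float = 3.0):
    """Return the ids the server reports, or None if it cannot be reached.

    The default port 1337 is the one this section gives for Jan; the
    /v1/models path is an assumption based on OpenAI compatibility.
    """
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return extract_model_ids(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return None
```

Calling `list_models()` returns `None` while the server is stopped and a list of model ids once it is running, which makes it easy to tell "server not started" apart from "server up but no models downloaded" (an empty list).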
LocalAI: A drop-in OpenAI API replacement that runs locally. Use it when you want to run code written for the OpenAI API against a local model; the only change the application needs is the API base URL.
docker run -d \
  -p 8080:8080 \
  -v /mnt/d/models:/models \
  --name localai \
  quay.io/go-skynet/local-ai:latest
Create a model config at /mnt/d/models/llama3.yaml:
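# Name that clients pass in the "model" field of API requests
name: llama3.2:1b
backend: llama
parameters:
  # Model file name, relative to the mounted models directory
  model: llama3.2-1b.Q4_K_M.gguf
  temperature: 0.7
  top_p: 0.9
context_size: 2048
f16: true
threads: 8
# Only takes effect when the container has GPU access
gpu_layers: 35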
Test with:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:1b",
    "messages": [{"role": "user", "content": "What year was Python 3 released?"}]
  }'
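The same request can be made from application code. The following is a minimal sketch using only the Python standard library, not an official client. The helper names (`build_chat_request`, `extract_reply`, `ask`) are hypothetical, and the response shape (`choices[0].message.content`) is assumed to follow the OpenAI chat completions format that LocalAI aims to be compatible with.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    return response_body["choices"][0]["message"]["content"]


def ask(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send one user message and return the model's reply.

    Raises urllib.error.URLError if the server is not reachable.
    """
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_reply(json.load(resp))
```

With the container running, `ask("http://localhost:8080", "llama3.2:1b", "What year was Python 3 released?")` sends the request shown in the curl test. Because the base URL is a parameter, pointing the same function at another OpenAI-compatible server (for example Jan on port 1337, with a model id that server actually reports) requires no other code change.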
Tool selection summary:
| Tool | Interface | Model Format | Best For |
|---|---|---|---|
| Ollama | CLI + API | GGUF (stored as blobs) | Fast setup, CLI-first |
| LM Studio | GUI | GGUF | Non-technical users |
| Open WebUI | Web | None of its own (uses Ollama or an OpenAI-compatible backend) | Multi-user, RAG |
| Jan | GUI + API | GGUF, native | Chat interface, API backend |
| text-generation-webui | Web | GGUF, safetensors | Maximum configuration |
| LocalAI | API | GGUF | API compatibility testing |
Install Jan from jan.ai, download a model, start the API server, and confirm it responds to a curl request to http://localhost:1337/v1/chat/completions. Compare the model download location with Ollama's download location.
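For the download-location comparison, a short script can report how many files each tool has stored and how much disk space they use. This is a rough sketch under stated assumptions: the Jan path is the one given earlier in this section, the Ollama path (`.ollama\models` under the user's home directory) is its usual default and may differ if it was reconfigured, and `summarize_model_dir` is a hypothetical helper name.

```python
from pathlib import Path


def summarize_model_dir(root: Path) -> tuple[int, int]:
    """Return (file_count, total_bytes) for all files under root.

    A missing directory is reported as (0, 0) rather than raising.
    """
    if not root.is_dir():
        return (0, 0)
    sizes = [p.stat().st_size for p in root.rglob("*") if p.is_file()]
    return (len(sizes), sum(sizes))


# Assumed default locations; adjust if either tool was configured differently.
CANDIDATE_DIRS = {
    "Jan": Path.home() / "AppData" / "Roaming" / "jan" / "models",
    "Ollama": Path.home() / ".ollama" / "models",
}

if __name__ == "__main__":
    for tool, root in CANDIDATE_DIRS.items():
        count, total = summarize_model_dir(root)
        print(f"{tool}: {root} -> {count} files, {total / 1e9:.2f} GB")
```

A directory that reports zero files usually means the tool stores its models somewhere else on this machine, which is worth checking in the tool's settings before assuming nothing was downloaded.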