What plugs into your local AI runtime. 37 curated apps across 12 categories — chat UIs, coding agents, RAG pipelines, voice, image, browser extensions, editor plugins, mobile + desktop, agent frameworks, productivity, SDK wrappers.
Each entry carries an honest editorial verdict — pros, cons, the runtime it works against, the minimum VRAM, and the privacy posture. Filter to your stack, jump to the detail page, ship.
The URL updates as you change filters; share or bookmark any result. All filters are server-rendered, so the page works without JS.
23 of 37 apps match your filters
The default chat UI for solo Ollama users. Multi-model, built-in RAG, web search, Docker-friendly.
“Best default chat UI for solo Ollama users. Pick this first; switch only if you outgrow it.”
Terminal coding agent that edits files via your local model. Git-aware, surgical, fast.
“Best terminal-native coding agent for local models. Qwen 2.5 Coder 32B is its sweet spot.”
Desktop app that bundles model download + chat + OpenAI-compatible local server. Closed-source but free.
“Best 'first install' desktop app for newcomers. Closed-source but the easiest first-run experience.”
Privacy-first desktop chat with a curated model catalog. Llama / Mistral / Qwen one click from the app.
“Best one-binary desktop chat. Curated catalog removes 'which model?' decision paralysis.”
Air-gappable RAG over your docs. The OG offline-RAG project, now mature and team-friendly.
“Best when air-gap compliance is the requirement. Less polished than AnythingLLM, more configurable.”
Free, native macOS / iOS Stable Diffusion app. Runs SD3, Flux on a phone (yes, really).
“Best mobile + macOS SD app. Free, native, no Python — runs Flux on Apple Silicon impressively well.”
Nomic's free desktop AI with model catalog + chat + Python SDK. Long-standing, open-source.
“Best fully-open-source desktop AI bundler. Less polished than LM Studio, fully MIT.”
Official Python SDK for Ollama. Async, streaming, typed — the right primitive for scripts.
“Foundational primitive for Python scripts against Ollama. Official, maintained, typed.”
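What "async, streaming, typed" looks like in practice: a minimal sketch, assuming a local Ollama on the default port and a chat model you have already pulled (the model name below is a placeholder).

```python
# Minimal async streaming sketch with the official `ollama` Python SDK.
# Assumes Ollama is running on its default port (11434) and that the
# placeholder model below has been pulled locally.
import asyncio

from ollama import AsyncClient

async def main() -> None:
    client = AsyncClient()
    # stream=True yields typed response chunks as the model generates them
    async for chunk in await client.chat(
        model="llama3.2",  # placeholder: any chat model you have pulled
        messages=[{"role": "user", "content": "Say hello in one line."}],
        stream=True,
    ):
        print(chunk["message"]["content"], end="", flush=True)
    print()

asyncio.run(main())
```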
Self-hosted coding agent server with team SSO, audit logs, and dashboards. Enterprise-grade.
“Best self-hosted server for teams. SSO + audit logs make it the IT-friendly pick.”
Native iOS / macOS Ollama client. Beautiful SwiftUI, talks to your home Ollama server.
“Best mobile Ollama client. Native SwiftUI; works against your home Ollama server.”
Native macOS app for Whisper transcription. Drag a file in, get a transcript out.
“Best Whisper desktop app on macOS. Pay once, transcribe locally forever.”
Local semantic search across all your Obsidian notes. Embed once, query fast, fully offline.
“Best local semantic search for personal notes. Foundational layer for Obsidian RAG.”
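The "embed once, query fast" pattern behind tools like this is easy to sketch: embed each note at index time, persist the vectors, then answer queries with one embedding call plus local similarity math. A toy illustration under stated assumptions (a local Ollama with an embedding model pulled; names are placeholders), not the plugin's actual code.

```python
# Toy "embed once, query fast" sketch: the pattern, not any plugin's
# implementation. Assumes a local Ollama with an embedding model pulled
# (the model name is a placeholder).
import math

import ollama

notes = {
    "workout.md": "Tuesday: squats, deadlifts, 30 min zone-2 cardio.",
    "reading.md": "Notes on The Mythical Man-Month, chapters 1-4.",
}

def embed(text: str) -> list[float]:
    # Calls the local /api/embeddings endpoint; nothing leaves the machine.
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

# Embed once: run at index time and persist the vectors to disk.
index = {path: embed(body) for path, body in notes.items()}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Query fast: one embedding call per query, then pure-local math.
query = embed("what did I lift this week?")
best = max(index, key=lambda path: cosine(query, index[path]))
print(best)  # workout.md
```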
Open-source Whisper transcription with mic + file modes. Cross-platform Qt app.
“Best open-source Whisper desktop app. Cross-platform, free, less polished than MacWhisper.”
Krita plugin that wires ComfyUI into a real digital-art workflow. Inpaint, outpaint, upscale.
“Best 'SD as digital-art tool' integration. Real Krita workflow, not a wrapper UI.”
Official Node + browser SDK for Ollama. ESM-first, typed, streaming.
“Foundational primitive for Node + browser apps against Ollama. ESM-native, typed.”
Browser sidebar that talks to your local Ollama. Page summarization, chat, vision support.
“Best 'sidebar AI' browser extension that's truly local-first.”
AI note-taking app that builds connections between your notes automatically. Local, open-source.
“Best AI-first note app that's actually local. Niche but well-executed.”
Free, lightweight VS Code copilot that runs entirely on Ollama. Strong on autocomplete.
“Best minimal-surface Copilot-replacement that's been Ollama-native since day one.”
One-click Stable Diffusion app for macOS. No setup, just run.
“Easiest macOS SD app — picks defaults so you don't have to.”
Drop-in, OpenAI-compatible TTS server. Self-hosted, talks to local voice models.
“Best 'drop-in local TTS for OpenAI clients'. Bridge solution for existing pipelines.”
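Because the server speaks OpenAI's TTS API, you can point the stock OpenAI client at it by swapping the base URL. A hedged sketch: the port, model, and voice values below are assumptions; check what your server actually exposes.

```python
# Pointing the stock OpenAI Python client at a local OpenAI-compatible
# TTS server. Base URL, model, and voice are assumptions; substitute the
# values your server documents.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local server address
    api_key="not-needed-locally",         # local servers typically ignore this
)

response = client.audio.speech.create(
    model="tts-1",  # placeholder: whatever your server maps this name to
    voice="alloy",  # placeholder voice
    input="Local text to speech, no cloud round trip.",
)

# The response wraps the raw audio bytes; write them out as-is.
with open("speech.mp3", "wb") as f:
    f.write(response.content)
```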
Android Ollama client + on-device fallback for small models. Cross-platform Flutter.
“Best cross-platform Android-friendly Ollama client. Falls back to on-device for tiny models.”
Codeium's self-hosted enterprise backend lets the popular IDE plugin run fully on your hardware.
“Best 'enterprise Copilot' replacement when self-hosting is mandatory. Paid tier.”
Terminal entry into Khoj's local AI assistant. Use it like grep, get answers, never leave the shell.
“Best terminal companion for note-summarization workflows. Pipe-friendly.”
We curate this directory editorially — same review queue as the benchmarks feed. Open an issue with the project link and your one-line pitch for why it belongs.
Editorial review applies — same standards as the rest of the site. We won't list apps that don't actually work against a local runtime, regardless of marketing claims.
The runtime layer: Ollama, vLLM, llama.cpp, MLX, LM Studio server, ComfyUI. What the apps in this directory talk to.
Tell us your use case, get a full rig recipe — runtime + models + the apps from this directory that fit your stack.
Match an app's minimum-VRAM requirement to real hardware with our price/perf comparison.
Real operator submissions on the model × hardware × app combos that work. The proof behind the editorial picks.