What plugs into your local AI runtime. 37 curated apps across 12 categories — chat UIs, coding agents, RAG pipelines, voice, image, browser extensions, editor plugins, mobile + desktop, agent frameworks, productivity, SDK wrappers.
Each entry carries an honest editorial verdict — pros, cons, the runtime it works against, the minimum VRAM, and the privacy posture. Filter to your stack, jump to the detail page, ship.
URL updates as you change filters — share or bookmark a result. All filters are server-rendered, so the page works without JS.
3 of 37 apps matching your filters
Drop-in OpenAI-compatible proxy across 100+ providers. Route to local Ollama or cloud, same code.
“Best universal LLM proxy. Foundational layer for multi-provider deployments.”
Official Python SDK for Ollama. Async, streaming, typed — the right primitive for scripts.
“Foundational primitive for Python scripts against Ollama. Official, maintained, typed.”
Official Node + browser SDK for Ollama. ESM-first, typed, streaming.
“Foundational primitive for Node + browser apps against Ollama. ESM-native, typed.”
We curate this directory editorially — same review queue as the benchmarks feed. Open an issue with the project link and your one-line pitch for why it belongs.
Editorial review applies — same standards as the rest of the site. We won't list apps that don't actually work against a local runtime, regardless of marketing claims.
The runtime layer: Ollama, vLLM, llama.cpp, MLX, LM Studio server, ComfyUI. What apps in this directory talk to.
Tell us your use case, get a full rig recipe — runtime + models + the apps from this directory that fit your stack.
Match an app's minimum-VRAM requirement to real hardware with our price/perf comparison.
Real operator submissions on the model × hardware × app combos that work. The proof behind the editorial picks.