Local AI tools

46 tools reviewed. Runners, GUIs, and servers for every workflow.

Stable Diffusion WebUI (AUTOMATIC1111)

The original Stable Diffusion frontend. Less actively developed in 2026 than ComfyUI but still has the cleanest UX for simple gen.

macOS

Linux

Windows

Our rating: 4.4/5

Ollama

runner

OSS

The default first-pull tool for local AI. One-line model installs (`ollama run llama3.1`), an OpenAI-compatible HTTP API, good defaults out of the box. Built on llama.cpp.

LangChain

orchestrator

OSS

Python/JS framework for chains, agents, and RAG. Batteries-included but heavyweight; many graduate to LangGraph or DIY.

any

Our rating: 4/5

llama.cpp

runner

OSS

The bedrock of local LLM inference. Most other tools wrap or embed it. Maximum control, maximum platform support, sharpest learning curve.

Open WebUI

gui

OSS

Self-hosted ChatGPT-style web frontend. Pairs with Ollama or any OpenAI-compatible backend. Multi-user, RAG built in, fast.

GPT4All

gui

OSS

One of the original local-LLM apps from Nomic. Privacy-focused, runs on CPU, decent model library. Pace of development has slowed compared to Jan/Msty.

ComfyUI

gui

OSS

Node-graph image-generation UI. Standard for Stable Diffusion and Flux workflows. Endlessly customizable.

Zed (with AI)

ide

OSS

High-performance native editor from the Atom team, with built-in AI panel and inline assistant. BYO API key for any provider.

Open Interpreter

orchestrator

OSS

Lets LLMs execute code locally — Python, shell, AppleScript. The original 'Code Interpreter on your machine'. Useful for automation tasks.

Cline

agent

OSS

VS Code extension agent — ~4M installs in 2026. Plan/Act mode, autonomous file edits with diff approval, terminal access. The leading open-source IDE agent.

vLLM

server

OSS

High-throughput serving engine. PagedAttention, continuous batching, prefix caching. Production default for self-hosted LLM APIs at scale.

Linux

Our rating: 4.8/5

Text Generation WebUI (oobabooga)

gui

OSS

The 'AUTOMATIC1111 of LLMs'. Kitchen-sink Gradio UI with multi-backend support and a big extension ecosystem.

LlamaIndex

orchestrator

OSS

Python/JS framework focused on RAG and document indexing. Cleaner than LangChain for retrieval-heavy use cases.

any

Our rating: 4.2/5

Unsloth

finetuner

OSS

2x faster QLoRA fine-tuning with hand-tuned Triton kernels. Free OSS for single-GPU; commercial Pro for multi-GPU.

Linux

Our rating: 4.6/5

AnythingLLM

gui

OSS

Document-oriented LLM frontend with workspaces. Connects to Ollama, LM Studio, OpenAI, Anthropic, etc. Strong document RAG.

Claude Code

agent

Anthropic's terminal-native coding agent. Tops SWE-bench Verified at 87.6% and SWE-bench Pro at 64.3% in 2026. Deep MCP integration, agentic file editing, and a $20/mo Pro tier are the standout signals.

Jan

gui

OSS

Open-source desktop ChatGPT alternative. Privacy-first, runs offline, supports Hugging Face import.

Aider

agent

OSS

Terminal-based AI pair programmer. Run in your project directory, describe a change, it edits files and creates meaningful git commits. Works with any LLM — local Ollama, Anthropic, OpenAI, etc.

Continue

agent

OSS

Open-source VS Code and JetBrains assistant. Configurable autocomplete + chat + agent modes. Strong with local Ollama backends.

Llamafile

runner

OSS

Mozilla's single-binary llama.cpp distribution. Download one file, run on any OS without dependencies.

Roo Code

agent

OSS

Cline fork that ships features faster — diff-based editing reduces per-task token cost ~30%. Multiple specialized modes (Architect, Code, Debug).

macOS

Linux

Windows

Codex CLI

agent

OSS

Open-source CLI client for the new Codex agent. Local CLI that orchestrates cloud Codex models against your file tree.

NVIDIA TensorRT-LLM

server

OSS

NVIDIA's optimized inference path for Hopper, Ada, and Blackwell. Compile your model once, serve at peak hardware speed.

Linux

Windows

Our rating: 4.3/5

OpenCode

agent

OSS

Open-source terminal coding agent built by the SST team. TUI-first, BYO LLM, MCP-compatible. A Claude-Code-style workflow without the Anthropic lock-in.

Axolotl

finetuner

OSS

YAML-config fine-tuning framework. Reference toolkit for the open fine-tuning community (Hermes, Dolphin, etc. all use it).

Linux

Our rating: 4.4/5

Text Generation Inference (TGI)

server

OSS

HuggingFace's production inference server. Slightly behind vLLM on raw throughput but tighter integration with the HF ecosystem.

Linux

Our rating: 4.2/5

Kilo Code

agent

OSS

VS Code agent — 1.5M users in 2026, supports 500+ models, charges zero markup over upstream API costs. Cline lineage with Roo Code's diff approach.

macOS

Linux

Windows

Pinokio

orchestrator

OSS

Browser-style app launcher for AI tools. One-click installs of ComfyUI, oobabooga, RVC, and many other AI apps.

KoboldCPP

gui

OSS

Single-file llama.cpp distribution focused on roleplay and creative writing. Bundles a web UI, image gen, and the Kobold API.

ExLlamaV2

runner

OSS

GPU-only inference library optimized for consumer NVIDIA cards. Fastest tokens-per-second on a single 24GB card for 30B models in EXL2 quant.

Linux

Windows

Our rating: 4.4/5

MLX-LM

runner

OSS

Apple's Metal-native ML framework's LLM runner. Now competitive with llama.cpp Metal on M-series silicon, with better long-context performance.

macOS

Our rating: 4.5/5

Sourcegraph Cody

agent

OSS

Sourcegraph's AI assistant. Strong at large-codebase context retrieval thanks to the underlying Sourcegraph index.

Hugging Face Hub CLI

quantizer

OSS

The CLI for the world's model hub. `hf download`, `hf upload`, model card editing.

any

Our rating: 4.5/5

OpenClaw Gateway

orchestrator

OSS

Open-source LLM gateway with multi-provider fallbacks. Sits between an agent and many LLM providers (Anthropic, OpenAI, Google, local Ollama) so you can fail over and load-balance.

Pi (Inflection AI)

agent

Inflection AI's consumer assistant — voice-first, conversational, designed for personal use rather than coding. Powered by Inflection-2.5.

Cursor

ide

Anysphere's AI-native IDE. Forks VS Code with Cursor Tab inline completion, agentic chat, and background agents. Best 'flow' for inline completion in 2026.

Claude Desktop

agent

Anthropic's official desktop app for Claude. Native MCP server support means you can plug in local file access, GitHub, and custom tools. Distinct from the Claude Code CLI.

macOS

Windows

Our rating: 4.4/5

OpenAI Codex

agent

OpenAI's 2025 coding agent (the new Codex, distinct from the deprecated 2021 model). Cloud task-runner pattern: hand it a multi-step task, it works in a sandbox and returns a PR.

LM Studio

gui

Polished desktop GUI for local LLMs. Built-in HuggingFace search, OpenAI-compatible local server, side-by-side conversations.

Windsurf (Codeium)

ide

Codeium's AI-native IDE (formerly known as Codeium). Cascade agent, supercomplete, and a generous free tier.

Droid (Factory)

agent

Factory's autonomous SWE agent. Operates over GitHub PRs, Slack, Linear. Targets the long-running multi-file change workflow.

Devin

agent

Cognition Labs' fully autonomous SWE agent. Cloud-only, browser interface, longest task horizons. Premium pricing.

web (browser)

Our rating: 4/5

Replit Agent 3

agent

Replit's full-stack scaffolder agent. Goes from prompt to deployed app on Replit's hosted runtime.

web (browser)

Our rating: 4.3/5

JetBrains AI Assistant

ide

JetBrains' first-party AI for IntelliJ, PyCharm, WebStorm, etc. Multi-LLM backend (OpenAI, Anthropic, Gemini, local).

Msty

gui

Cross-platform desktop client supporting local and cloud models in one window. Strong on knowledge-stack RAG.

GitHub Copilot

ide

GitHub's incumbent AI assistant. VS Code, JetBrains, Neovim integrations. Lost some inline-completion mindshare to Cursor and agentic mindshare to Claude Code, but still the easiest enterprise rollout via GitHub.