What's the best coding agent for local models (Ollama / llama.cpp)?

Reviewed May 15, 2026 · 2 min read
Tags: coding-agents · aider · cline · continue · tabby

The answer

One paragraph. No hedging beyond what the data actually warrants.

Five real options, ranked by how much I actually use them:

1. Aider — terminal-native, git-aware, surgical. It reads your repo, proposes edits as diffs, and commits them; its constrained edit format applies cleanly even on mid-tier local models. The killer app on the CLI side. Sweet spot: Qwen 2.5 Coder 32B on a 24GB GPU.

2. Cline — VS Code extension that runs a full agent loop locally. Plan → read → propose → ask permission → write → run → verify. Excellent permission UX. First-class Ollama support. Heavier on tokens than Aider — local models with weak context handling can struggle.

3. Continue — autocomplete + chat for VS Code and JetBrains. Open-source rival to Cursor / Copilot. Configurable to use Ollama / llama.cpp / vLLM. Default config nudges you toward local. JetBrains support is on par with VS Code — rare in this space.

4. Tabby — self-hosted coding-assistant server with SSO, audit logs, and team dashboards. Enterprise-leaning. Pick this when you need to roll out local AI to a team of 20+ and prove what was generated by whom.

5. Twinny — minimal-surface VS Code extension purpose-built for Ollama. Tighter integration than Continue for the autocomplete-only case, lower latency. Good "just give me Copilot but local" pick.
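As a concrete starting point, here's how the Aider pairing above is typically wired up. A minimal sketch, assuming Ollama on its default port; the model tag is an assumption too, so check `ollama list` for what you actually have pulled.

```shell
# Point Aider's backend at the local Ollama server (default port 11434
# is an assumption; adjust if you changed it).
export OLLAMA_API_BASE=http://127.0.0.1:11434

# Pull a quantized coder model if Ollama is installed.
# The exact tag is an assumption; browse the Ollama library for variants.
command -v ollama >/dev/null && ollama pull qwen2.5-coder:32b

# Launch Aider; the ollama/ model prefix routes requests through its
# LiteLLM backend to the local server.
command -v aider >/dev/null && aider --model ollama/qwen2.5-coder:32b
```

The `command -v` guards just keep the sketch from erroring on machines without the tools; in practice you'd run the two commands directly.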

Model pairing matters more than the agent:

  • 24GB GPU + Qwen 2.5 Coder 32B Q4_K_M → Aider / Cline / Continue all work well.
  • 16GB GPU + DeepSeek Coder 6.7B Q4_K_M (FIM) → Twinny / Continue's autocomplete shines, agent-mode struggles.
  • 12GB GPU + Qwen 2.5 Coder 7B Q4_K_M → autocomplete only; agent loops will fight you.
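The 24GB pairing above isn't arbitrary. A rough rule of thumb (an estimate, not a measurement): weight memory at the quantization's bits-per-weight, plus ~20% headroom for KV cache and runtime overhead. The ~4.85 bits/weight figure for Q4_K_M is the commonly cited average and is itself an approximation.

```shell
# Back-of-envelope VRAM estimate for a Q4_K_M quantized model:
# weights at ~4.85 bits/weight plus ~20% for KV cache and overhead.
params_b=32      # model size in billions of parameters
bpw=4.85         # approximate effective bits per weight at Q4_K_M
awk -v p="$params_b" -v b="$bpw" \
  'BEGIN { printf "%.1f GB\n", p * b / 8 * 1.2 }'   # prints "23.3 GB"
```

23.3 GB against a 24GB card explains why 32B Q4_K_M is the ceiling for that tier, and why the 16GB and 12GB rows drop to 7B-class models.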

The misconception: "I'll use Cursor with my local backend." Cursor's local-backend support has historically been fragile, and parts of its pipeline still route through the cloud even when you point it at a local model. Pick a native-local agent instead.

Where we got the numbers

All five agents have full editorial pages in /apps with hands-on verdicts. The model-pairing thresholds come from community runlocalai-bench submissions plus my own runs, May 2026.

Found this via a forum search? Bookmark the URL — we update these pages as new data lands. Have a question that should live here? Open a GitHub issue.