OpenHands
AI-driven development agent that completes engineering tasks end-to-end — branches, code, PRs. v1.6 added a Planning Mode that drafts a plan before executing. Local-LLM-friendly via Ollama, vLLM, and SGLang. The strongest open alternative to closed coding agents in 2026 (77.6% SWE-bench Verified).
What this tool actually is
OpenHands is the autonomous coding agent that, through 2024-2026, became the open-source default when teams said “we want a Claude Code alternative we can self-host.” Calling it “an LLM coding tool” — which is how vendor docs and most listings frame it — undersells two things at once. First, it's an autonomous task runner, not a chat interface: you describe a task, it plans, it edits files, it runs tests, it reports back. Second, it's an integration platform: MCP servers wire in, memory layers wire in, custom tools wire in — the agent loop is configurable in ways most coding-agent products aren't.
The layer it occupies in the stack:
- Below: an inference runtime (vLLM, SGLang, Ollama, or a cloud API) hosting the actual LLM weights, plus zero-or-more MCP servers (filesystem, git, postgres, etc.) plus an optional memory layer (Mem0, Letta).
- Above: the developer (or team) running tasks via CLI, the OpenHands UI, or an Open WebUI-style chat front door.
What it replaces in practice: hand-rolled scripts wrapping git apply + LLM completions, the manual paste-and-edit-and-run loop, and the “just call OpenAI from a Python script” first attempt that every team makes before realizing they need a real agent harness. The 2025-2026 cycle moved OpenHands from research project to operational infrastructure for serious self-hosted coding teams.
Who it is for. Solo developers running autonomous tasks on private codebases. Small teams that need self-hosted agent infrastructure because cloud APIs aren't politically or technically acceptable. Engineers who use it as the front door to MCP-driven workflows: the workspace is the repo, MCP servers wire in, and the agent does real work.
Who it is not for. Anyone who wants conversational chat without autonomous execution (use Open WebUI). Anyone whose workflow is surgical edit-and-commit (use Aider, a different paradigm). Anyone unwilling to monitor agent loops carefully: autonomous agents will go off the rails on novel tasks, so supervision is non-optional.
Architecture
The mental model that makes OpenHands make sense:
```
┌────────────────────────────────────────────────────────────────┐
│  OpenHands runtime                                             │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Planning loop                                           │  │
│  │  - task ingest → plan draft → plan approval → execute    │  │
│  │  - Planning Mode (v1.6+): plan shown to user first       │  │
│  │  - Replan trigger: tool result deviates from plan        │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │  Tool dispatcher                                         │  │
│  │  - MCP servers (stdio + remote)                          │  │
│  │  - Built-in: shell, file edit, browser                   │  │
│  │  - Memory queries (Mem0 / Letta integration)             │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │  LLM client                                              │  │
│  │  - OpenAI-compatible (vLLM / SGLang / Ollama / cloud)    │  │
│  │  - Streaming, tool-call protocol, retries                │  │
│  └─────────────────────────┬────────────────────────────────┘  │
│                            │                                   │
│  ┌─────────────────────────▼────────────────────────────────┐  │
│  │  Sandbox executor                                        │  │
│  │  - Docker / chroot / native fork                         │  │
│  │  - File-system allowlisting via filesystem MCP           │  │
│  │  - Git boundary enforcement via git MCP                  │  │
│  └──────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘
```
Three things to understand:
The planning loop is the differentiator. Most agent products run a tool-calling loop where the model decides each next step from local state. OpenHands runs a plan-first loop in v1.6+: the model drafts a multi-step plan before executing anything. Plans are inspectable, approvable, and replannable. This is what separates “agent that does autonomous work” from “LLM that calls tools.”
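To make the plan-first idea concrete, here is a minimal Python sketch of such a loop. It is not OpenHands' implementation: `run_plan_first`, `PlanRun`, and the four callables (`draft_plan`, `approve`, `execute_step`, `deviates`) are hypothetical names chosen for illustration, and the `max_replans` cap is an assumption added so the sketch cannot re-plan forever.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlanRun:
    """Record of what the loop did (hypothetical structure, not an OpenHands type)."""
    executed: List[str] = field(default_factory=list)
    replans: int = 0
    approved: bool = True


def run_plan_first(
    task: str,
    draft_plan: Callable[[str, List[str]], List[str]],
    approve: Callable[[List[str]], bool],
    execute_step: Callable[[str], str],
    deviates: Callable[[str, str], bool],
    max_replans: int = 3,
) -> PlanRun:
    """Draft a full plan, get approval, execute, and re-plan on deviation."""
    run = PlanRun()
    plan = draft_plan(task, run.executed)
    while True:
        # Nothing is executed until the whole plan has been approved.
        if not approve(plan):
            run.approved = False
            return run
        for step in plan:
            result = execute_step(step)
            run.executed.append(step)
            if deviates(step, result):
                break  # tool result does not match the plan: re-plan
        else:
            return run  # every step completed without deviation
        if run.replans >= max_replans:
            return run  # assumed safety cap against endless re-planning
        run.replans += 1
        plan = draft_plan(task, run.executed)
```

A reactive tool-calling loop would instead ask the model for one next step at a time; the difference in the sketch is that `approve` sees the entire plan before `execute_step` is ever called.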
Tool dispatch is MCP-first. The dispatcher treats built-in tools (shell, file edit, browser) and MCP-provided tools (filesystem, git, postgres, custom) as equivalent. This means an OpenHands deployment can be radically extended without forking the codebase — drop in new MCP servers, the agent picks them up. It's the most operationally important architectural choice.
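The "equivalent tools" idea can be sketched as a single registry keyed by tool name. This is an illustrative assumption about the shape of such a dispatcher, not OpenHands' code: `ToolRegistry`, `builtin_shell_echo`, and `make_mcp_tool` are hypothetical, and the MCP-backed tool here only returns a description of the call it would forward rather than talking to a real server.

```python
from typing import Any, Callable, Dict, List

ToolFn = Callable[..., Any]


class ToolRegistry:
    """One lookup table for built-in and MCP-provided tools (hypothetical)."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolFn] = {}

    def register(self, name: str, fn: ToolFn) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = fn

    def dispatch(self, name: str, **kwargs: Any) -> Any:
        # The caller does not know or care where the tool came from.
        try:
            fn = self._tools[name]
        except KeyError:
            raise KeyError(f"unknown tool: {name}") from None
        return fn(**kwargs)

    def names(self) -> List[str]:
        return sorted(self._tools)


def builtin_shell_echo(text: str) -> str:
    """Stand-in for a built-in tool."""
    return text


def make_mcp_tool(server: str, tool: str) -> ToolFn:
    """Stand-in for an MCP-backed tool; a real one would forward to the server."""
    def call(**kwargs: Any) -> Dict[str, Any]:
        return {"server": server, "tool": tool, "arguments": kwargs}
    return call
```

Adding a new MCP server then amounts to more `register` calls; nothing in `dispatch` changes, which is the property the paragraph above describes.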
The sandbox executor is a real isolation boundary. Docker / chroot / native fork modes; allowlisted filesystem access; git boundary enforcement. The default Docker mode is the right pick for most users — if the agent goes off the rails, the blast radius is the container.
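As an illustration of what allowlisted filesystem access means in practice, the following sketch checks whether a requested path stays inside an allowed root after `..` segments and symlinks are resolved. The function name `is_allowed` and the example paths are hypothetical; how the filesystem MCP server actually enforces its allowlist is not shown in the source.

```python
from pathlib import Path
from typing import Iterable


def is_allowed(candidate: str, allowed_roots: Iterable[str]) -> bool:
    """Return True only if the resolved path is an allowed root or lies under one."""
    # resolve() normalizes ".." segments and follows symlinks where they exist,
    # so "repo/../secrets.txt" is judged by where it really points.
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

Comparing resolved paths by ancestry rather than by string prefix avoids a classic mistake: a prefix check would wrongly accept `/srv/repo-other` for an allowlist entry of `/srv/repo`.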
The serving layer on top is the OpenHands UI (browser-based, ships with the Docker image) plus an optional CLI (openhands run) for headless workflows. Both speak to the same runtime.
Local stack compatibility
OpenHands is runtime-agnostic by design: anything that exposes an OpenAI-compatible /v1 endpoint plugs in, with first-class tested integrations for vLLM, SGLang, Ollama, LM Studio, Anthropic API, and OpenAI API. The compatibility matrix shows the eight backends we've actually tested it against, with the operator notes that matter when wiring each. The short version: vLLM is the production default for self-hosted; Ollama is the solo-developer default; Anthropic API is the right choice when capability ceiling matters more than privacy. Apple Silicon users running MLX-LM should expect throughput to trail Linux/CUDA by ~30% but otherwise identical wiring.
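What "OpenAI-compatible" buys you is that the same request shape works against every backend; only the base URL and model name change. The sketch below builds (but does not send) a chat-completions request with the standard library. The helper name `build_chat_request` is hypothetical, and the base URLs and model name in the usage note are assumptions to verify against your own setup.

```python
import json
import urllib.request
from typing import Dict, List


def build_chat_request(
    base_url: str,
    model: str,
    messages: List[Dict[str, str]],
    api_key: str = "not-needed",  # placeholder; many local servers ignore the key
) -> urllib.request.Request:
    """Build a POST to <base_url>/chat/completions in the OpenAI wire format."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

For example, `build_chat_request("http://localhost:11434/v1", "example-model", msgs)` would target a local Ollama server on its usual port, while swapping the base URL to `http://localhost:8000/v1` would target a vLLM server started with default settings. Passing the result to `urllib.request.urlopen` sends it.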
Real deployment paths
The four ways teams actually run OpenHands in 2026, ordered by operator skill required. The deployment cards show hardware + complexity at a glance; the prose here is operator-grade detail.
The solo developer path is where most readers start. docker run the OpenHands image, point it at Ollama on the same machine, drop into the UI, run a task. Total setup time is under 10 minutes if Ollama is already running. Constraint is hardware: a 16GB GPU with Ollama and 32GB system RAM gets you serious 13B-class models with comfortable context room.
The production workstation path is the /stacks/local-coding-agent canonical setup. OpenHands + vLLM + filesystem/git MCP + Mem0 on an RTX 4090. Multi-session memory; first-class MCP integration; 60-180 second end-to-end iteration on real bugfixes. The right deployment when one developer wants serious agent infrastructure on private hardware.
The team-shared path is the operationally tricky middle. Multiple users hit one OpenHands instance through Open WebUI. The architecture supports it, but per-user workspace isolation is less mature than it should be: accidents where User A's task touches User B's workspace do happen. Production-harden this with strict allowlisting and audit logs before promoting to a 10+ user setup.
The distributed cluster path is genuine production. OpenHands instances autoscale behind Ray Serve; vLLM or SGLang clusters serve the underlying LLM. Concurrent autonomous tasks at scale. This is where teams replacing $50K/month of Claude Code or Cursor licenses end up.
Resource usage and performance
Numbers to plan around:
- Idle memory for the OpenHands process: ~400-700 MB. Each active task adds ~100-200 MB.
- End-to-end task time on a real bugfix (read 5 files, edit 2, run tests): 60-180 seconds when paired with Qwen 2.5 Coder 32B on an RTX 4090 via vLLM. With Ollama instead, expect 90-240 seconds.
- Tool calls per task typical range: 5-20 for surgical edits; 20-50 for substantial refactors; 50-150 for whole-feature autonomous work.
- Token cost per task with cloud APIs: $0.05-0.30 for surgical edits, $0.50-3.00 for substantial work, $5-30 for whole-feature autonomous attempts. Self-hosted = effectively free at the per-token margin once the hardware is amortized.
- Memory query latency matters more than expected. Mem0 + LanceDB: 50-150ms. Letta with explicit paging: 100-300ms. Multiply by 5-15 retrievals per task. Slow memory adds up.
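The planning numbers above combine with simple arithmetic. The sketch below turns them into (low, high) ranges; the function names are hypothetical and the defaults are just the figures quoted in the list.

```python
from typing import Tuple

Range = Tuple[int, int]


def memory_budget_mb(
    active_tasks: int,
    idle: Range = (400, 700),      # idle process memory, MB
    per_task: Range = (100, 200),  # extra memory per active task, MB
) -> Range:
    """Low and high estimate of OpenHands process memory in MB."""
    return (
        idle[0] + active_tasks * per_task[0],
        idle[1] + active_tasks * per_task[1],
    )


def retrieval_overhead_ms(
    retrievals: Range = (5, 15),    # memory retrievals per task
    latency_ms: Range = (50, 150),  # per-query latency (Mem0 + LanceDB figures)
) -> Range:
    """Low and high estimate of total memory-query time per task in ms."""
    return (retrievals[0] * latency_ms[0], retrievals[1] * latency_ms[1])
```

With three active tasks, `memory_budget_mb(3)` gives 700 to 1300 MB. `retrieval_overhead_ms()` gives 250 to 2250 ms per task for Mem0 + LanceDB, and `retrieval_overhead_ms(latency_ms=(100, 300))` gives 500 to 4500 ms for Letta, which is why slow memory is noticeable against a 60-180 second task.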
Honest scaling limit: a single OpenHands instance handles one user's concurrent task workload comfortably; 2-5 light users via the UI; past that, deploy multiple instances behind a router. The team-shared single-instance pattern wobbles past 5 active users.
Failure modes
The list of things that will go wrong in production, in rough order of how often we've seen them:
- Agent loops on plan revision. OpenHands keeps re-planning instead of executing. Usually a mismatch between Planning Mode and headless runs, where nothing approves the drafted plan. Set `plan_first = false` after the first session, or use the UI to approve plans explicitly.
- Filesystem MCP path-escape attempt. The allowlist is enforced. Symptom: the agent reports “permission denied” on files outside your repo. That's correct behavior — widen the allowlist deliberately if needed.
- Tool-call timeout on slow operations. Default 30s. Long-running tests (integration suites that boot a database) blow past it. Configure per-tool timeouts in the MCP config.
- Memory drift between sessions. Episodic memory says one thing; the actual repo / database state says another. The agent confidently reasons against stale knowledge. Always query MCP-git or MCP-postgres for ground truth before acting on episodic memory.
- Sandbox container resource exhaustion. Long agent loops consume disk + memory. Set Docker resource limits on the sandbox container.
- OpenAI-compatibility tool-call format mismatch. Some local runtimes (older llama.cpp, some Ollama versions) emit tool calls in slightly non-standard JSON. OpenHands' parser is forgiving but breaks occasionally. Pin the runtime version that works.
- MCP server hang. A malformed MCP server can hang waiting on stdin/stdout. OpenHands' stdio transport doesn't time out aggressively enough by default. Wrap MCP server processes in a watchdog that restarts them on stall.
- Multi-LoRA inference confusion. When the underlying vLLM serves multiple LoRA adapters, OpenHands occasionally sends requests to the wrong adapter under high concurrency. Pin one model per OpenHands instance.
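Several of the failures above (tool-call timeouts, hung MCP servers) come down to bounding how long one call may block. The following is a minimal sketch of per-tool timeouts with a 30-second default, as cited above. It is not OpenHands' or MCP's configuration format: `call_tool` and the `timeouts` mapping are hypothetical, and a thread-based timeout only stops waiting; it does not kill the underlying work, so a real watchdog would run the tool in a subprocess it can terminate.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable, Dict

DEFAULT_TIMEOUT_S = 30.0  # the default quoted in the list above


def call_tool(
    name: str,
    fn: Callable[..., Any],
    timeouts: Dict[str, float],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Run fn, waiting at most the per-tool limit (or the default)."""
    limit = timeouts.get(name, DEFAULT_TIMEOUT_S)
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=limit)
    except FutureTimeout:
        # The worker thread keeps running; we only stop waiting for it.
        raise TimeoutError(f"tool {name!r} exceeded {limit}s") from None
    finally:
        pool.shutdown(wait=False)


def slow_integration_suite() -> str:
    """Stand-in for a long-running test command."""
    time.sleep(0.3)
    return "passed"
```

Calling `call_tool("tests", slow_integration_suite, {"tests": 0.05})` raises `TimeoutError`, while raising that one tool's limit (for example `{"tests": 2.0}`) lets it finish without loosening the default for every other tool.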
How it compares
vs Claude Code. Claude Code is the polished closed-source flagship; OpenHands is the open-source equivalent. Claude Code has stronger reasoning depth (Anthropic's underlying models); OpenHands has self-hostability and configurable MCP. Pick Claude Code for capability ceiling; OpenHands for privacy + control.
vs OpenClaw. OpenClaw is the 2026 hype magnet; OpenHands is the longer-track-record alternative. OpenClaw is faster-moving but less battle-tested; OpenHands is more stable. Pick OpenHands for production today; experiment with OpenClaw for the next-cycle features.
vs Aider. Different paradigms. Aider is a git-integrated CLI editor (each session = one focused commit). OpenHands is an autonomous task runner (each session = a multi-step plan). Pick Aider for surgical edits where you want the agent under tight control; OpenHands for autonomous work where you want the agent to drive.
vs Goose. Both are MCP-first open-source agents. Goose is Block's extension-platform play; OpenHands is the broader autonomous-coding-agent play. They overlap on MCP heaviness; Goose wins on extension ergonomics, OpenHands on raw autonomous task quality.
vs Cline / Continue. VS Code-native agents. Different category — they live in the editor; OpenHands lives in a browser UI or CLI. Pick Cline/Continue for “help me edit while I'm coding”; OpenHands for “go do this task autonomously while I do something else.”
Best use cases
Where OpenHands is genuinely the right answer:
- Bug triage and fix on private codebases — read files, find root cause, propose patch, run tests.
- Refactoring at scale — “rename this concept across all 200 files; verify tests pass.”
- Test-suite stabilization — agent investigates flaky tests, proposes fixes, iterates.
- Documentation generation from code — agent reads modules, drafts docs, you review.
- Multi-step autonomous work on stable codebases — paired with Mem0 or Letta for cross-session continuity.
Where OpenHands is the wrong answer:
- Surgical commits with tight control — use Aider.
- Conversational chat without autonomous execution — use Open WebUI.
- IDE-resident assistance — use Cline or Continue.
- Greenfield projects with no existing context — agents struggle without grounding signal.
- Anyone unwilling to monitor agent loops — autonomous agents go off the rails; supervision is non-optional.
Verdict
OpenHands is the default open-source autonomous coding agent for self-hosted deployment in 2026. Planning Mode in v1.6+ closed the “agent loops without making progress” gap that plagued earlier autonomous-agent products; the MCP-first tool dispatcher makes it the cleanest extension surface in the category; and the sandbox executor finally treats “the agent might do something destructive” as a real isolation boundary rather than a hope. The /stacks/local-coding-agent recipe pulls this together with a vLLM runtime, Mem0 memory, and filesystem/git MCP — that's the stack we recommend by default.
The honest tradeoffs: it's an autonomous agent, which means you have to monitor it; team-shared single-instance deployment is younger than it should be; and the iteration speed varies dramatically with the underlying model + runtime quality. None of those are reasons to default away — they're reasons to deploy the production-workstation path with vLLM + a strong coding model.
Buy / use this if you want autonomous agent work on a private codebase and you have the operational maturity to monitor agent loops. Skip it if your workflow is surgical-edit-only, you're unwilling to set up MCP servers, or your team is large enough that you need real multi-tenant isolation that OpenHands doesn't yet provide.
Rating math: 4.7/5. The open-source autonomous-coding-agent crown is genuinely OpenHands' to lose in May 2026. The 0.3 points lost are for the team-shared deployment surface, which is the real next operational gap.
Sources
- OpenHands GitHub — release notes, architecture docs, MCP integration history.
- OpenHands documentation — planning loop, tool dispatch, sandbox modes.
Related
- /stacks/local-coding-agent — the canonical OpenHands production deployment recipe
- /stacks/memory-enabled-agent — the long-horizon variant with Mem0 + Letta integration
- vLLM — the production runtime pairing
- Mem0, Letta — the canonical memory-layer pairings
- /tools/mcp-server-filesystem, git, postgres — the MCP servers OpenHands deployments wire in
- Claude Code, OpenClaw — the closest competitors
- Aider, Cline — different-paradigm alternatives
- /systems/mcp — the protocol layer
- /maps/local-ai-agents-2026 — where OpenHands sits in the agent ecosystem
- /authors/fred-oline — about the author
| Status | Runtime / Stack | Notes |
|---|---|---|
| Excellent | vLLM | The canonical production pairing. Continuous batching matters because OpenHands makes 5-15 tool calls per task; prefix caching keeps the system prompt resident across the agent loop. Used in /stacks/local-coding-agent. |
| Excellent | SGLang | Wins over vLLM on stable-system-prompt agent loops (>50% prefix-cache hit rate). The wall-clock advantage compounds across the loop. Drop in via OpenAI-compatible endpoint with no adapter. |
| Good | Ollama | Default for single-developer setups. Loses concurrency benefits vs vLLM but wins on time-to-first-token-after-zero-config. Pick Ollama for solo workflows; pick vLLM the moment a second user shows up. |
| Good | LM Studio | OAI-compatible local server pairs cleanly with OpenHands. Good when you want a GUI for model swapping alongside the autonomous agent. |
| Excellent | Anthropic API | Cloud path — Claude Sonnet / Haiku as the model. Higher capability ceiling than open models for the most complex tasks; the privacy tradeoff is the main reason teams stay local. |
| Good | OpenAI API | Works fine; the official Python SDK pattern. Cost grows fast on agent loops with 10+ tool calls per task. Most teams who tried this migrated to local or Anthropic by 2026. |
| Good | MLX-LM | Apple Silicon path via the OAI-compatible bridge. Throughput trails Linux/CUDA by ~30%; ergonomics are excellent. Pairs with /stacks/apple-silicon-ai. |
| Limited | TensorRT-LLM | Doable through Triton's OpenAI shim. Operationally heavy; the per-model recompile is friction OpenHands' rapid iteration doesn't tolerate well. Use when latency matters more than iteration speed. |
Solo developer, single-machine
Complexity: trivial. One workstation, OpenHands + Ollama + filesystem MCP. Solo developer iterating on a single repo. The fastest path from zero to a working autonomous agent: under 10 minutes if Ollama is already installed.
Production workstation, vLLM-backed
Complexity: moderate. OpenHands + vLLM + MCP filesystem/git/postgres + Mem0 + RTX 4090. The /stacks/local-coding-agent canonical setup. Multi-session agent runs with persistent memory. Real production deployment for a single power user or a small team.
Team-shared, Open WebUI multi-tenant front door
Complexity: involved. Multiple users hit one OpenHands instance via Open WebUI as the chat surface. vLLM as the runtime. Per-user workspace isolation handled by OpenHands' workspace primitive. The right pattern for a 5-15 person team that wants shared agent infrastructure.
Distributed cluster, Ray Serve orchestrated
Complexity: expert. OpenHands instances autoscale behind Ray Serve, talking to vLLM/SGLang clusters. The /stacks/distributed-inference-homelab pattern extended with agent loops. For genuine production with concurrent autonomous tasks at scale.
Stack & relationships
How OpenHands relates to other entries in the catalog — recommended pairings, alternatives, dependencies, and edges to avoid. Each edge carries a one-line operator note from our editorial team.
Recommended stack
- Pairs with Mem0 (agent memory API)
The default memory pairing for OpenHands. 20 lines of config; works out of the box. The /stacks/local-coding-agent stack uses this pairing.
- Commonly deployed with vLLM
vLLM is the default inference engine in the canonical OpenHands stack. Continuous batching pays for itself when the agent makes 10+ tool calls per task.
- Commonly deployed with SGLang
Pick SGLang over vLLM in OpenHands when traffic includes shared system prompts (>50% prefix-cache hit rate). The wall-clock advantage compounds across the agent loop.
- Pairs with Letta (memory framework)
Letta provides the persistent memory tier OpenHands lacks natively. Pair via OpenHands' memory provider config. Heavier wiring than Mem0 but stronger long-horizon-task behavior.
- Pairs with Zep (memory platform)
Zep memory provider integration is cleaner on OpenHands than on Goose or Aider. Picks up agent decisions across sessions automatically.
- Commonly deployed with MCP Brave Search Server
Brave Search MCP is the default web-research path in agentic OpenHands setups. Privacy-respecting; sufficient API quota tier for typical agent use.
- Pairs with MCP Sequential Thinking
Sequential Thinking MCP gives OpenHands a structured reasoning scratchpad outside the main context window. Useful in long-horizon tasks where the model would lose track of an evolving plan.
- Pairs with Phoenix (Arize AI)
Phoenix instruments OpenHands tool-calls + LLM spans via OpenInference. Drop-in for trace + eval workflows.
Works with
- Integrates with MCP Filesystem Server
Filesystem MCP is non-optional for OpenHands: it's how the agent reads and writes project files. Allowlist limits blast radius.
- Integrates with MCP Git Server
Git MCP gives OpenHands repo metadata awareness: what changed, when, why. Pairs naturally with filesystem MCP for full repo grounding.
- Integrates with Model Context Protocol (MCP)
MCP is one of several tool transports OpenHands speaks; the protocol is supported but not the only path.
- Integrates with MCP PostgreSQL Server
Postgres MCP exposes structured-knowledge memory to OpenHands. The /stacks/memory-enabled-agent recipe wires this path.
Alternatives
- Competes with OpenClaw
Both are open-source coding agents. OpenClaw exploded in early 2026; OpenHands has the longer track record and Planning Mode in v1.6. Pick by ecosystem fit.
- Competes with Aider
OpenHands is a full agent loop; Aider is a git-integrated CLI editor. Different paradigms: Aider for surgical edits, OpenHands for autonomous tasks.
- Competes with OpenClaw
Both are open-source autonomous coding agents. OpenClaw is faster-moving and the 2026 hype magnet; OpenHands has the longer track record. Pick OpenHands for stability, OpenClaw for the latest features.
- Competes with Goose
Both are open-source agents. Goose is MCP-first by design; OpenHands has broader tool-transport support. Pick Goose if MCP is non-negotiable; OpenHands for flexibility.
- Alternative to Aider
Different paradigms. Aider is a git-integrated CLI editor; OpenHands is an autonomous task agent. Pick Aider for surgical commits, OpenHands for autonomous work.
Featured in these stacks
The L3 execution stacks that pick this tool as a recommended component, with the one-line note explaining the role it plays in each.
- Stack: Build a local coding-agent stack (May 2026) · L3 · Workstation tier · Role: Coding agent (the planning + execution loop)
OpenHands v1.6 ships Planning Mode (drafts a plan before execution) and has the longest production track record in the open-source category. Pick OpenHands over Aider when you want autonomous task execution; pick Aider for surgical git-integrated edits.
- Stack: Build a memory-enabled local agent stack (May 2026) · L3 · Workstation tier · Role: Coding agent + planning loop
OpenHands over Goose / Aider for memory-enabled workflows: Planning Mode pairs naturally with persistent memory (the plan from session N becomes context for session N+1), and the MCP integration is the strongest in the open-source category. Goose is competitive but Mem0 integration is cleaner on OpenHands.
- Stack: Build a fully offline coding stack (May 2026) · L3 · Workstation tier · Role: Coding agent (offline-verified)
OpenHands has the cleanest offline path of the autonomous-agent leaders. Docker container + filesystem MCP + provider-abstracted memory all work without internet once dependencies are pre-staged. OpenClaw works too but its faster release cadence makes the dependency-pinning audit harder.
Pros
- MIT license — free forever
- Local-LLM support out of the box (Ollama, vLLM, SGLang)
- Native GitHub / GitLab / CI integrations
- Planning Mode drafts a plan for review before executing
Cons
- Setup requires Docker for sandboxed execution
- Quality varies sharply by underlying LLM choice
- Still trails Claude Code on SWE-bench Pro
Compatibility
| Property | Value |
|---|---|
| Operating systems | macOS, Linux, Windows |
| GPU backends | n/a (uses local Ollama / vLLM / cloud) |
| License | Open source, free (OSS); pay only for the chosen LLM provider |
Runtime health
Operator-grade signals on how actively OpenHands is being maintained, how fresh its measurements are, and what failure classes operators have flagged. Every label below is anchored to a real date or count — we never infer maintainer activity we can't show.
Release cadence
Derived from the most recent editorial signal on this row.
8 days since last refresh · source: lastUpdated
Benchmark freshness
How recent the editorial measurements on this runtime are.
No editorial benchmarks for this runtime yet.
Community reproduction
Submissions that match an editorial measurement on similar hardware.
No community reproductions on file yet.
Ecosystem stability
Editorial rating from RunLocalAI — qualitative, not measured.
Reviewed by RunLocalAI Editorial. See our editorial policy for how we evaluate tools.