Letta (memory framework)

Agent memory framework that models memory like an operating system. Main context = RAM, archival storage = disk; the agent itself decides when to page. Originally MemGPT, now Letta. Model-agnostic (Anthropic, OpenAI, Ollama, Vertex), with REST API + dev environment for stateful agent services.

By Fredoline Eruo·Last verified Jun 12, 2026·18,000 GitHub stars

Overview

Setup guidance

Install via pip: pip install letta. Requires Python 3.10+. Letta (formerly MemGPT) is a framework for building stateful LLM agents with persistent memory and tool use.

Start the Letta server: letta server. This starts the REST API at http://localhost:8283. On first run, letta server auto-creates the SQLite DB; start time is about 5 seconds.

Create an agent: letta create-agent --name my-agent --model gpt-4o. For local models: letta create-agent --name local-agent --model llama3.2 --llm-endpoint http://localhost:11434/v1 --llm-endpoint-type openai. The agent persists its memory (conversation history, core memories, archival memories) to a SQLite database. Time-to-first-agent is about 30 seconds, including the model prompt.

Chat via CLI: letta chat --agent my-agent.

The Letta SDK provides programmatic access: from letta import Letta; client = Letta(base_url="http://localhost:8283"); agent = client.agents.get("agent-id"); response = client.agents.messages.create(agent_id=agent.id, messages=[{"role": "user", "content": "Hello"}]).

Verify: run letta chat --agent my-agent and send a message. The agent should respond and persist its memory to SQLite.
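Before creating agents, it can help to confirm that something is actually listening on the server port. The following is a minimal sketch using only the Python standard library. The helper name is_listening is hypothetical and not part of Letta, and the check only shows that a TCP connection succeeds on the port, not that the process behind it is Letta.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8283, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; 8283 is the default port named in the text.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("Port 8283 accepting connections:", is_listening())
```

If this prints False a few seconds after starting letta server, the server likely failed to bind or is still starting.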

Workload fit

Best for: persistent AI companions that remember users and context across sessions; customer support agents that accumulate knowledge about accounts over time; personal AI assistants that grow their understanding of the user; research agents that maintain a growing knowledge graph from literature scanning; and any application where "the agent should remember what we talked about yesterday" is a requirement.

Not suited for: stateless single-turn Q&A (use direct LLM API calls); RAG over static document collections (use LlamaIndex); applications where memory-as-a-service is preferred over an agent framework (use Mem0 or Zep); latency-sensitive real-time systems (memory paging adds 1–3 seconds per archival memory access).

If SQLite doesn't meet your persistence requirements, use the Postgres backend.
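To judge whether the paging overhead fits a latency budget, the 1–3 second figure can be multiplied out per turn. This is a rough sketch that assumes archival accesses happen sequentially and that the per-access range quoted above applies uniformly. The function name archival_latency_budget is hypothetical.

```python
def archival_latency_budget(
    accesses: int, low_s: float = 1.0, high_s: float = 3.0
) -> tuple[float, float]:
    """Estimate the (best, worst) added seconds for `accesses` sequential archival reads.

    The 1.0 and 3.0 second defaults are the per-access range quoted in the text.
    """
    if accesses < 0:
        raise ValueError("accesses must be non-negative")
    return accesses * low_s, accesses * high_s


if __name__ == "__main__":
    best, worst = archival_latency_budget(4)
    print(f"4 archival accesses add roughly {best:.0f} to {worst:.0f} seconds")
```

For a turn that touches archival memory four times, the estimate is 4 to 12 added seconds, which is why real-time systems are listed as a poor fit.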

Alternatives

Use Letta when you need LLM agents with long-term persistent memory: an agent that remembers conversations across days, maintains a growing knowledge base, and self-edits its own memory. Letta's virtual context management (OS-inspired paging of memories between the context window and persistent storage) is unique among open-source agent frameworks.

Switch to Mem0 when you want memory as an API layer for any LLM application rather than a full agent framework. Mem0 is a memory service; Letta is a memory-native agent platform. Use LangChain agents when you need broader tool ecosystem integration without the memory primitives. Use CrewAI or AutoGen for multi-agent orchestration.

Letta's strength is the memory architecture: it treats LLM context the way an OS treats RAM and uses SQLite/Postgres as "disk" for paging memories in and out. Its weakness is weight: it is heavier than simpler memory solutions (Mem0, Zep), and the full agent framework adds complexity when you only need memory.
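The paging idea can be illustrated with a toy model. The sketch below is not Letta's implementation: the class ToyMemory and its methods are hypothetical, the context budget is counted in items rather than tokens, and eviction is automatic (oldest first), whereas in Letta the agent decides what to page through tool calls.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Toy two-tier memory: a bounded main context plus unbounded archival storage."""

    context_limit: int = 3  # assumed budget, counted in items for simplicity
    main_context: deque = field(default_factory=deque)
    archival: list = field(default_factory=list)

    def remember(self, item: str) -> None:
        """Add an item to main context, paging the oldest items out when over budget."""
        self.main_context.append(item)
        while len(self.main_context) > self.context_limit:
            self.archival.append(self.main_context.popleft())

    def page_in(self, query: str) -> list:
        """Move archival items containing `query` (case-insensitive) back into main context."""
        hits = [item for item in self.archival if query.lower() in item.lower()]
        for hit in hits:
            self.archival.remove(hit)
            self.remember(hit)  # may page out something else to make room
        return hits


if __name__ == "__main__":
    mem = ToyMemory(context_limit=2)
    for fact in ["user likes tea", "user lives in Oslo", "user owns a cat"]:
        mem.remember(fact)
    print("context:", list(mem.main_context))
    print("archival:", mem.archival)
    print("paged in:", mem.page_in("tea"))
    print("context after page-in:", list(mem.main_context))
```

The point of the sketch is the trade-off: the main context stays small and cheap, while anything evicted remains recoverable at the cost of an explicit lookup.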

Troubleshooting + when to switch

Problem: letta server fails with "address already in use" on port 8283. Fix: Change the port: letta server --port 8284. If using the Letta client, specify: client = Letta(base_url="http://localhost:8284"). The REST API and admin UI both default to 8283.

Problem: Agent memory doesn't persist between sessions with local models. Fix: Letta's memory management requires the model to respond correctly to function-calling prompts (for memory read/write/edit tools). Smaller local models may not implement tool calling reliably. The memory tools (core_memory_append, archival_memory_insert, core_memory_replace) are embedded in the system prompt as function definitions; if the model doesn't invoke them, memory doesn't update. Test with a tool-calling capable model (Llama 3.1 8B function-calling variant, Mistral 7B v0.3, or Qwen 2.5 7B).

Problem: letta chat exits with "Agent not found." Fix: Agent state is stored in the SQLite database (~/.letta/letta.db by default). If you point the server at a different database or the database was reset, previously created agents will not be found. Run letta list-agents to see available agents. Each server instance manages a single database, so running multiple server instances with different DB paths isolates their agents.
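When memory fails to update with a local model, one way to narrow it down is to inspect a raw model response and see whether any memory tool was invoked at all. The sketch below assumes you have captured a response from your OpenAI-compatible endpoint as a dict in the Chat Completions shape (choices, message, tool_calls). It does not call Letta, and the helper name memory_tools_called is hypothetical. The tool names are the ones listed above.

```python
# Memory tool names as listed in the troubleshooting text above.
MEMORY_TOOLS = {"core_memory_append", "core_memory_replace", "archival_memory_insert"}


def memory_tools_called(completion: dict) -> list[str]:
    """Return the memory tool names invoked in an OpenAI-style chat completion dict."""
    called = []
    for choice in completion.get("choices", []):
        message = choice.get("message") or {}
        # `tool_calls` is absent or None when the model replied with plain text.
        for call in message.get("tool_calls") or []:
            name = (call.get("function") or {}).get("name")
            if name in MEMORY_TOOLS:
                called.append(name)
    return called


if __name__ == "__main__":
    with_tool = {
        "choices": [{
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [{
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "core_memory_append", "arguments": "{}"},
                }],
            }
        }]
    }
    plain_text = {"choices": [{"message": {"role": "assistant", "content": "Noted!"}}]}
    print(memory_tools_called(with_tool))   # ['core_memory_append']
    print(memory_tools_called(plain_text))  # []
```

A model that consistently produces the second shape when asked to remember something is answering in prose instead of calling tools, which matches the symptom described above.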

Stack & relationships

How Letta (memory framework) relates to other entries in the catalog — recommended pairings, alternatives, dependencies, and edges to avoid. Each edge carries a one-line operator note from our editorial team.

Letta (memory framework) ↔ ecosystem

Recommended stack

Pairs with
OpenHands
Letta provides the persistent memory tier OpenHands lacks natively. Pair via OpenHands' memory provider config. Heavier wiring than Mem0 but stronger long-horizon-task behavior.
Pairs with
vLLM
Letta drives an inference engine via OpenAI-compatible API. vLLM's continuous batching matters because Letta makes 5-15 retrieval-then-generate calls per task. Same wiring pattern as Mem0.

Alternatives

Competes with
Mem0 (agent memory API)
Mem0 is drop-in agent memory; Letta is OS-style explicit memory management. Pick Mem0 for fast wiring; Letta when you need to reason about memory state explicitly.
Alternative to
Mem0 (agent memory API)
Letta is OS-style explicit memory management (paging, archival, working memory split); Mem0 is drop-in vector memory. Pick Letta when you need explicit, inspectable memory behavior; Mem0 when you want fast wiring.
Competes with
Zep (memory platform)
Both target long-horizon agent memory. Letta is explicit memory hierarchy; Zep is temporal knowledge graph. Different mental models — pick by whether memory state is something you want to inspect or something you want to query.
Alternative to
MCP Memory Server
MCP Memory is JSON-on-disk knowledge-graph memory — entry-tier. Letta is OS-style explicit management. Pick MCP Memory for trivial setup; Letta for production-grade memory.
Alternative to
Mem0 (agent memory API)
Different abstractions for the same need. Mem0: drop-in API with implicit memory. Letta: explicit OS-like memory hierarchy. The right choice depends on whether you want to control memory state or just have it work.

Pros

OS-style memory architecture is uniquely suited to long-running agents
Model-agnostic — pairs cleanly with local Ollama
Mature dev environment (ADE) for inspecting agent state
Open-source under Apache 2.0

Cons

Steeper learning curve than drop-in Mem0 API
Requires explicit memory-management tool calls in agent loop
Less ergonomic for stateless one-shot agents

Compatibility

Operating systems	macOS, Linux, Windows
GPU backends	n/a
License	Open source (free) + managed cloud option

Runtime health

Operator-grade signals on how actively Letta (memory framework) is being maintained, how fresh its measurements are, and what failure classes operators have flagged. Every label below is anchored to a real date or count — we never infer maintainer activity we can't show.