Letta (memory framework)
Agent memory framework that models memory like an operating system. Main context = RAM, archival storage = disk; the agent itself decides when to page. Originally MemGPT, now Letta. Model-agnostic (Anthropic, OpenAI, Ollama, Vertex), with REST API + dev environment for stateful agent services.
Setup guidance
Letta (formerly MemGPT) is a framework for building stateful LLM agents with persistent memory and tool use.
Install via pip: pip install letta. Requires Python 3.10+.
Start the Letta server: letta server. This starts the REST API at http://localhost:8283. On first run the server auto-creates the SQLite database; startup takes about 5 seconds.
Create an agent: letta create-agent --name my-agent --model gpt-4o. For local models: letta create-agent --name local-agent --model llama3.2 --llm-endpoint http://localhost:11434/v1 --llm-endpoint-type openai. The agent persists its memory (conversation history, core memories, archival memories) to a SQLite database.
Chat via CLI: letta chat --agent my-agent.
The Letta SDK provides programmatic access. In recent releases the Python client ships as the separate letta-client package (pip install letta-client): from letta_client import Letta; client = Letta(base_url="http://localhost:8283"); agent = client.agents.retrieve(agent_id="agent-id"); response = client.agents.messages.create(agent_id=agent.id, messages=[{"role": "user", "content": "Hello"}]).
Time-to-first-agent: about 30 seconds, including the model prompt. Verify: run letta chat --agent my-agent and send a message; the agent should respond and persist memory to SQLite.
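The SDK call in the setup notes ends up as an HTTP request against the local server. The following is a minimal standard-library sketch of that request, not the SDK's own implementation. It assumes the message endpoint is POST /v1/agents/{agent_id}/messages, a path not stated in this entry that should be verified against the API reference for your installed version. The helper names build_message_request and send are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8283"  # default address of the Letta server per the setup notes


def build_message_request(agent_id: str, text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Assemble (but do not send) a user message for one agent.

    ASSUMPTION: the endpoint path below is illustrative; check your server's API reference.
    """
    payload = {"messages": [{"role": "user", "content": text}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/agents/{agent_id}/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON reply; requires a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request needs no server, so it can be inspected before anything is sent.
req = build_message_request("agent-id", "Hello")
print(req.get_method(), req.full_url)
```

Calling send(req) only makes sense once the server is running and "agent-id" has been replaced with the ID of an existing agent.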
Workload fit
Best for:
- persistent AI companions that remember users and context across sessions
- customer support agents that accumulate knowledge about accounts over time
- personal AI assistants that grow their understanding of the user
- research agents that maintain a growing knowledge graph from literature scanning
- any application where "the agent should remember what we talked about yesterday" is a requirement
Not suited for:
- stateless single-turn Q&A (use direct LLM API calls)
- RAG over static document collections (use LlamaIndex)
- applications where memory-as-a-service is preferred over an agent framework (use Mem0 or Zep)
- latency-sensitive real-time systems (memory paging adds 1–3 seconds per archival memory access)
- environments where SQLite doesn't meet persistence requirements (use the Postgres backend)
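To judge whether the paging overhead fits a latency budget, the 1–3 second figure quoted above can be turned into a rough per-turn estimate. This is back-of-the-envelope arithmetic, assuming archival accesses happen one after another and that nothing else in the turn changes; the function name is hypothetical.

```python
def added_latency_seconds(archival_accesses: int, low: float = 1.0, high: float = 3.0) -> tuple[float, float]:
    """Return the (best, worst) extra seconds a turn spends on archival memory access.

    The defaults are the 1-3 second per-access range quoted in this entry.
    """
    if archival_accesses < 0:
        raise ValueError("archival_accesses must be non-negative")
    return archival_accesses * low, archival_accesses * high


# A turn that touches archival memory twice.
best, worst = added_latency_seconds(2)
print(f"{best:.0f}-{worst:.0f} s of added latency")
```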
Alternatives
Use Letta when you need LLM agents with long-term persistent memory: an agent that remembers conversations across days, maintains a growing knowledge base, and self-edits its own memory. Its distinguishing feature is virtual context management, an OS-inspired scheme that treats the LLM context as an OS treats RAM and uses SQLite or Postgres as the "disk" for paging memories in and out.
Switch to Mem0 when you want memory as an API layer for any LLM application rather than a full agent framework; Mem0 is a memory service, whereas Letta is a memory-native agent platform. Use LangChain agents when you need broader tool ecosystem integration without the memory primitives. Use CrewAI or AutoGen for multi-agent orchestration.
Letta's strength is the memory architecture. Its weakness is weight: it is heavier than simpler memory solutions (Mem0, Zep), and the full agent framework adds complexity when you only need memory.
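The paging idea can be illustrated with a toy model. This is not Letta's implementation: word counts stand in for tokens, eviction here is automatic and oldest-first (in Letta the agent itself decides when to page), and recall is a plain substring match. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPagedMemory:
    """Toy model: a bounded main context backed by unbounded archival storage."""

    context_budget: int  # maximum number of words allowed in main context
    main_context: list[str] = field(default_factory=list)
    archival: list[str] = field(default_factory=list)

    def _context_size(self) -> int:
        # Word count stands in for a real tokenizer.
        return sum(len(message.split()) for message in self.main_context)

    def add(self, message: str) -> None:
        """Append a message, paging the oldest ones out while over budget."""
        self.main_context.append(message)
        while self._context_size() > self.context_budget and len(self.main_context) > 1:
            self.archival.append(self.main_context.pop(0))

    def page_in(self, keyword: str) -> list[str]:
        """Return archived messages containing the keyword (case-insensitive)."""
        needle = keyword.lower()
        return [message for message in self.archival if needle in message.lower()]


memory = ToyPagedMemory(context_budget=8)
memory.add("User's dog is named Biscuit")
memory.add("User prefers short answers")
memory.add("What should I feed my dog?")

print(memory.main_context)        # only the newest message still fits
print(memory.page_in("biscuit"))  # the evicted fact is recoverable from archival storage
```

The point of the sketch is the split: facts that no longer fit in the context window are not lost, but they cost an explicit retrieval step to bring back.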
Troubleshooting + when to switch
Problem: letta server fails with "address already in use" on port 8283.
Fix: Change the port: letta server --port 8284. If using the Letta client, point it at the new port: client = Letta(base_url="http://localhost:8284"). The REST API and admin UI both default to 8283.
Problem: Agent memory doesn't persist between sessions with local models.
Fix: Letta's memory management requires the model to respond correctly to function-calling prompts (for the memory read/write/edit tools). Smaller local models may not implement tool calling reliably. The memory tools (core_memory_append, archival_memory_insert, core_memory_replace) are embedded in the system prompt as function definitions; if the model doesn't invoke them, memory doesn't update. Test with a tool-calling capable model (Llama 3.1 8B function-calling variant, Mistral 7B v0.3, or Qwen 2.5 7B).
Problem: letta chat exits with "Agent not found."
Fix: Agent state is stored in the SQLite database (~/.letta/letta.db by default). If you change the database or the server was reset, previously created agents are lost. Run letta list-agents to see available agents. Each server instance manages a single database, so running multiple server instances with different DB paths isolates their agents from each other.
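The memory-persistence failure is easier to see with a toy dispatcher. The sketch below reuses the three tool names mentioned above, but the reply format, the argument names (label, content, old_content, new_content), and the ToyAgentMemory class are hypothetical stand-ins, not Letta's actual schema. It shows that a reply with no tool calls leaves memory untouched no matter what the text says.

```python
from dataclasses import dataclass, field

MEMORY_TOOLS = {"core_memory_append", "core_memory_replace", "archival_memory_insert"}


@dataclass
class ToyAgentMemory:
    """Hypothetical in-process stand-in for an agent's core and archival memory."""

    core: dict[str, str] = field(default_factory=lambda: {"human": "", "persona": ""})
    archival: list[str] = field(default_factory=list)

    def core_memory_append(self, label: str, content: str) -> None:
        self.core[label] = (self.core[label] + "\n" + content).strip()

    def core_memory_replace(self, label: str, old_content: str, new_content: str) -> None:
        if old_content not in self.core[label]:
            raise ValueError(f"{old_content!r} not found in core block {label!r}")
        self.core[label] = self.core[label].replace(old_content, new_content)

    def archival_memory_insert(self, content: str) -> None:
        self.archival.append(content)


def apply_model_reply(memory: ToyAgentMemory, reply: dict) -> int:
    """Run the memory tool calls found in a model reply; return how many ran."""
    executed = 0
    for call in reply.get("tool_calls", []):
        if call["name"] in MEMORY_TOOLS:
            getattr(memory, call["name"])(**call["arguments"])
            executed += 1
    return executed


memory = ToyAgentMemory()

# A model that only chats: the reply mentions the name, but nothing is stored.
chat_only = {"content": "Nice to meet you, Sam!"}
print(apply_model_reply(memory, chat_only), repr(memory.core["human"]))

# A model that emits the tool call: the fact lands in core memory.
with_tool = {
    "content": "Nice to meet you, Sam!",
    "tool_calls": [
        {"name": "core_memory_append", "arguments": {"label": "human", "content": "Name: Sam"}}
    ],
}
print(apply_model_reply(memory, with_tool), repr(memory.core["human"]))
```

When debugging a local model, logging the raw replies and counting memory tool calls per turn in this way is a quick check on whether the model is the weak link.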
Stack & relationships
How Letta (memory framework) relates to other entries in the catalog — recommended pairings, alternatives, dependencies, and edges to avoid. Each edge carries a one-line operator note from our editorial team.
Recommended stack
- Pairs with OpenHands
Letta provides the persistent memory tier OpenHands lacks natively. Pair via OpenHands' memory provider config. Heavier wiring than Mem0 but stronger long-horizon-task behavior.
- Pairs with vLLM
Letta drives an inference engine via an OpenAI-compatible API. vLLM's continuous batching matters because Letta makes 5-15 retrieval-then-generate calls per task. Same wiring pattern as Mem0.
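For the vLLM pairing, each of those calls is an ordinary OpenAI-style chat completion request. The sketch below only builds request bodies and sends nothing. It assumes a vLLM server exposing its OpenAI-compatible endpoint on the default port 8000; the model name "my-served-model", the system prompt, and the helper name are hypothetical placeholders.

```python
import json

# ASSUMPTION: vLLM's OpenAI-compatible server on its default port; adjust to your deployment.
VLLM_CHAT_URL = "http://localhost:8000/v1/chat/completions"


def chat_completion_body(model: str, system_prompt: str, user_text: str, max_tokens: int = 256) -> bytes:
    """Encode one OpenAI-style chat completion request body as JSON bytes."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": max_tokens,
    }
    return json.dumps(body).encode("utf-8")


# One task issues several such requests; three stand in here for the 5-15 quoted above.
steps = ["recall what the user said about deadlines", "draft a reply", "store the outcome"]
bodies = [chat_completion_body("my-served-model", "You are a stateful agent.", step) for step in steps]
print(len(bodies), "request bodies ready for", VLLM_CHAT_URL)
```

With several agents running at once, these small requests arrive interleaved, which is the traffic pattern continuous batching is designed to absorb.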
Alternatives
- Competes with Mem0 (agent memory API)
Mem0 is drop-in agent memory; Letta is OS-style explicit memory management. Pick Mem0 for fast wiring; Letta when you need to reason about memory state explicitly.
- Alternative to Mem0 (agent memory API)
Letta is OS-style explicit memory management (paging, archival, working memory split); Mem0 is drop-in vector memory. Pick Letta when you need deterministic memory behavior; Mem0 when you want fast wiring.
- Competes with Zep (memory platform)
Both target long-horizon agent memory. Letta is an explicit memory hierarchy; Zep is a temporal knowledge graph. These are different mental models: pick by whether memory state is something you want to inspect or something you want to query.
- Alternative to MCP Memory Server
MCP Memory is JSON-on-disk knowledge-graph memory — entry-tier. Letta is OS-style explicit management. Pick MCP Memory for trivial setup; Letta for production-grade memory.
- Alternative to Mem0 (agent memory API)
Different abstractions for the same need. Mem0: drop-in API with implicit memory. Letta: explicit OS-like memory hierarchy. The right choice depends on whether you want to control memory state or just have it work.
Pros
- OS-style memory architecture is uniquely suited to long-running agents
- Model-agnostic — pairs cleanly with local Ollama
- Mature dev environment (ADE) for inspecting agent state
- Open-source under Apache 2.0
Cons
- Steeper learning curve than drop-in Mem0 API
- Requires explicit memory-management tool calls in agent loop
- Less ergonomic for stateless one-shot agents
Compatibility
| Operating systems | macOS, Linux, Windows |
| GPU backends | n/a |
| License | Open source · free (OSS) + managed cloud option |
Runtime health
Operator-grade signals on how actively Letta (memory framework) is being maintained, how fresh its measurements are, and what failure classes operators have flagged. Every label below is anchored to a real date or count — we never infer maintainer activity we can't show.
Release cadence
Derived from the most recent editorial signal on this row.
8 days since last refresh · source: lastUpdated
Benchmark freshness
How recent the editorial measurements on this runtime are.
No editorial benchmarks for this runtime yet.
Community reproduction
Submissions that match an editorial measurement on similar hardware.
No community reproductions on file yet.
Reviewed by RunLocalAI Editorial. See our editorial policy for how we evaluate tools.