Mem0 (agent memory API)

Drop-in memory layer for LLM agents. Vector + graph memory variants (Mem0g) — the graph variant builds a directed labeled knowledge graph alongside the vector store, with conflict detection on contradictory facts. Leads the 2026 agent-memory benchmarks at 68.4% LLM Score on multi-hop questions. Works with any LLM, including local Ollama models.

By Fredoline Eruo · Last verified Jun 12, 2026 · 28,000 GitHub stars

Overview

Setup guidance

Install via pip: pip install mem0ai. Requires Python 3.10+. Mem0 is a memory layer for LLM applications: it stores, retrieves, and updates personalized user memories so you do not have to build that infrastructure yourself.

Quick start: from mem0 import Memory; m = Memory(); m.add("I like pizza and live in New York", user_id="alice"); results = m.search("Where does Alice live?", user_id="alice"). The Memory class auto-configures with an in-memory vector store for development. For production, set MEM0_API_KEY to use the managed cloud service, or configure a local vector store (Qdrant, Chroma) in config.yaml.

Mem0 auto-extracts memories from conversation messages: messages = [{"role": "user", "content": "I'm a vegetarian"}, {"role": "assistant", "content": "Got it, I'll remember that"}]; m.add(messages, user_id="alice"). The extraction step picks out the vegetarian preference automatically.

To serve Mem0 as a REST API, create a Memory instance and a FastAPI app (from mem0 import Memory; from fastapi import FastAPI), then wrap endpoints around m.add(), m.search(), and m.get_all().

First run: ~30 seconds for package install and model download (defaults: all-MiniLM-L6-v2 for embeddings, GPT-4o-mini for memory extraction). To verify, run the search example above; it should return relevant memories. Time-to-first-memory: ~5 seconds including embedding computation.
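The quick start and the message-extraction call described above can be put together in one short script. This is a minimal sketch, not an official example: the helper name seed_and_query is made up for illustration, and it mirrors the add() and search() calls exactly as written in this entry. Newer Mem0 releases may expect filters={"user_id": ...} in search() instead of the user_id keyword, so check the version you have installed.

```python
def seed_and_query(memory, user_id):
    """Store one plain-text fact and one conversation, then query them.

    `memory` is any object exposing add() and search() the way this
    entry describes mem0.Memory; nothing here is Mem0-specific.
    """
    # A single string is stored as-is (after Mem0's extraction step).
    memory.add("I like pizza and live in New York", user_id=user_id)

    # A list of chat messages lets the extraction LLM decide what to keep.
    messages = [
        {"role": "user", "content": "I'm a vegetarian"},
        {"role": "assistant", "content": "Got it, I'll remember that"},
    ]
    memory.add(messages, user_id=user_id)

    return memory.search("Where does Alice live?", user_id=user_id)


def main():
    # Imported lazily so the helper above can be exercised without mem0.
    # Needs `pip install mem0ai` and a working extraction LLM
    # (OpenAI by default, so OPENAI_API_KEY must be set).
    from mem0 import Memory

    print(seed_and_query(Memory(), "alice"))
```

Calling main() runs the flow against a real Memory instance; the helper itself takes the memory object as a parameter so it can be tested with a stand-in before any API key is involved.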

Workload fit

Best for: adding user-level memory to LLM chatbots and applications without building memory infrastructure, personalization features where the LLM should remember user preferences across sessions, customer support bots that recall user history and context, AI companions that learn about the user over time, and any application where "the chat should remember what the user said last week" is a product requirement.

Not suited for: document-level RAG over files and databases (use LlamaIndex), full autonomous agents with memory-driven behavior (use Letta), applications requiring memory without API call overhead for extraction (build direct vector store + embedding integration), strict data sovereignty requirements (default extraction model is OpenAI-hosted), and applications where memory should be graph-structured rather than text-embedded (unless you adopt the Mem0g graph variant).

Alternatives

Use Mem0 when you need memory as a service: add user_id to your API calls and Mem0 handles storage, retrieval, deduplication, and memory updates, so you do not build the memory infrastructure yourself. It's the lightest way to add user-level memory to any LLM application.

Switch to Letta when you need a full agent framework with memory as one component; Letta's OS-inspired memory paging is more sophisticated but more opinionated. Use Zep for an alternative memory service with built-in conversation summarization and graph memory. Use LlamaIndex when memory is for document retrieval (RAG) rather than user preferences: Mem0 is user-centric memory, LlamaIndex is document-centric. Build your own memory layer with a vector DB when you need maximum control and minimal dependencies.

Mem0's strength: the extraction LLM automatically identifies what's worth remembering from conversations, so you don't write memory extraction logic. Its weakness: extraction is an LLM API call (OpenAI by default), which adds latency and cost to every .add() call.
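To make the "build your own memory layer" option concrete, here is a rough sketch of what the do-it-yourself path involves: embed each memory, skip near-duplicates at write time, and rank by cosine similarity within one user's memories at read time. Everything in it is illustrative. TinyMemory and embed are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real embedding model and vector DB. Note what is missing compared with Mem0: there is no LLM extraction step, so you store exactly the text you pass in.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real layer would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyMemory:
    """Per-user memory store with write-time dedup and similarity search."""

    def __init__(self, dedup_threshold=0.95):
        self.rows = []  # (user_id, text, vector)
        self.dedup_threshold = dedup_threshold

    def add(self, text, user_id):
        vector = embed(text)
        # Skip the write if this user already has a near-identical memory.
        for uid, _, other in self.rows:
            if uid == user_id and cosine(vector, other) >= self.dedup_threshold:
                return False
        self.rows.append((user_id, text, vector))
        return True

    def search(self, query, user_id, limit=3):
        q = embed(query)
        # Only score rows that belong to the requesting user.
        scored = [
            (cosine(q, vector), text)
            for uid, text, vector in self.rows
            if uid == user_id
        ]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]
```

Usage: m = TinyMemory(); m.add("I live in New York", "alice"); m.search("Where does Alice live?", "alice"). The trade-off is the one described above: no extraction latency or cost, but deciding what is worth remembering becomes your code's job.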

Troubleshooting + when to switch

Problem: Memory().add() hangs or fails silently. Fix: by default, .add() sends messages to OpenAI for memory extraction (gpt-4o-mini). If OPENAI_API_KEY is not set, the call fails silently, so export OPENAI_API_KEY=sk-... before calling .add(). Alternatively, switch to a local model for extraction by pointing the "llm" section of the Memory config at a local provider such as Ollama.

Problem: Memory search returns irrelevant or empty results. Fix: Mem0's search uses semantic similarity. If memories are too generic, the embedding distance may not distinguish them. Check that filters={"user_id": "alice"} is applied; without user_id filtering, results mix across users. Adjust threshold=0.3 in .search(): lower values return more results, higher values return fewer but closer matches.

Problem: Duplicate or contradictory memories accumulate. Fix: Mem0's deduplication checks for near-duplicate memories at .add() time based on embedding similarity, which does not catch contradictions. If contradictory memories enter (e.g., "I live in New York" then "I live in Boston"), the vector store can end up holding both; conflict detection is a feature of the graph variant (Mem0g). Use m.update(memory_id, "I live in Boston") to correct a memory manually. For automatic resolution, implement a periodic memory reconciliation pipeline on your side.
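The three fixes above can be sketched as small helpers. Treat this as an illustration under stated assumptions rather than Mem0's own tooling: the config shape is the one Memory.from_config() is generally documented to accept, the model names are placeholders, the helper names are hypothetical, and the record fields (id, memory, created_at, score) are assumed to match what your Mem0 version returns. The score filter also assumes that a higher score means a closer match, which depends on the vector store in use.

```python
import os

# Assumed shape for Memory.from_config(); model names are placeholders.
LOCAL_CONFIG = {
    "llm": {"provider": "ollama", "config": {"model": "llama3.1:8b", "temperature": 0}},
    "embedder": {"provider": "ollama", "config": {"model": "nomic-embed-text"}},
}


def check_extraction_backend(config=None, env=None):
    """Fail loudly up front instead of discovering a missing key inside .add()."""
    env = os.environ if env is None else env
    provider = (config or {}).get("llm", {}).get("provider", "openai")
    if provider == "openai" and not env.get("OPENAI_API_KEY"):
        raise RuntimeError(
            "OPENAI_API_KEY is not set and no local LLM provider is configured"
        )
    return provider


def filter_by_score(hits, threshold=0.3):
    """Client-side cut-off; assumes higher score means a closer match."""
    return [hit for hit in hits if hit.get("score", 0.0) >= threshold]


def stale_memory_ids(memories, slot_of):
    """Reconciliation pass: keep the newest memory per slot, return older ids.

    `slot_of` is a caller-supplied function mapping memory text to a slot
    name such as "home_city", or None when the memory has no slot.
    `created_at` values are assumed to sort chronologically as strings.
    """
    newest = {}
    stale = []
    for mem in sorted(memories, key=lambda m: m["created_at"]):
        slot = slot_of(mem["memory"])
        if slot is None:
            continue
        if slot in newest:
            stale.append(newest[slot]["id"])
        newest[slot] = mem
    return stale
```

A periodic job could list one user's memories with m.get_all(), pass them to stale_memory_ids(), and then update or delete the returned ids. How slot_of decides that two memories describe the same fact is the hard part; a rule-based function works for a handful of known attributes, and anything broader probably needs an LLM pass.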

Stack & relationships

How Mem0 (agent memory API) relates to other entries in the catalog — recommended pairings, alternatives, dependencies, and edges to avoid. Each edge carries a one-line operator note from our editorial team.

Mem0 (agent memory API) ↔ ecosystem

Recommended stack

Pairs with
OpenHands
The default memory pairing for OpenHands. 20 lines of config; works out of the box. The /stacks/local-coding-agent stack uses this pairing.
Pairs with
Claude Code
Mem0 hooks into Claude Code's MCP layer for cross-session memory. The integration is community-maintained; works but expect occasional config churn.

Works with

Works with
LanceDB
Mem0's default vector backend. LanceDB's embedded architecture pairs naturally with Mem0's single-process design — no additional service to firewall.
Works with
Qdrant
Mem0's production-tier vector backend. Switch from LanceDB to Qdrant when memory store grows past ~500K vectors per agent.
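The LanceDB-to-Qdrant switch point noted above can be written down as a small config helper. This is only a sketch: the 500,000 figure comes from the operator note and is approximate, the function name is hypothetical, the Qdrant host and port are placeholder local defaults, and the "lancedb" provider key is an assumption that should be checked against the providers your Mem0 version actually supports.

```python
# Approximate switch point taken from the operator note ("~500K vectors per agent").
QDRANT_SWITCH_THRESHOLD = 500_000


def vector_store_config(vector_count, collection="agent_memories"):
    """Return a vector_store config section sized to the current memory store."""
    if vector_count > QDRANT_SWITCH_THRESHOLD:
        # Separate Qdrant service; host and port are placeholder defaults.
        return {
            "provider": "qdrant",
            "config": {
                "collection_name": collection,
                "host": "localhost",
                "port": 6333,
            },
        }
    # Embedded backend, no extra service; provider key is an assumption.
    return {"provider": "lancedb", "config": {"collection_name": collection}}
```

The returned dict is meant to sit under the "vector_store" key of a Memory config. Moving existing vectors from one backend to the other is a separate migration step that this helper does not cover.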

Alternatives

Competes with
Letta (memory framework)
Mem0 is drop-in agent memory; Letta is OS-style explicit memory management. Pick Mem0 for fast wiring; Letta when you need to reason about memory state explicitly.
Competes with
Zep (memory platform)
Mem0 emphasises drop-in API; Zep emphasises temporal knowledge-graph memory. Different mental models — pick by whether you want graph traversal or vector retrieval.
Alternative to
MCP Memory Server
MCP Memory is JSON-on-disk knowledge-graph memory — entry-tier. Mem0 is a richer drop-in API. Pick MCP Memory for trivial setup; Mem0 for production-grade memory.
Alternative to
Letta (memory framework)
Letta is OS-style explicit memory management (paging, archival, working memory split); Mem0 is drop-in vector memory. Pick Letta when you need deterministic memory behavior; Mem0 when you want fast wiring.
Alternative to
Zep (memory platform)
Zep's temporal-graph approach handles 'what did Bob decide three sessions ago and why' better than Mem0's flat vector retrieval. Trade slower lookup for stronger multi-hop reasoning.
Alternative to
Letta (memory framework)
Different abstractions for the same need. Mem0: drop-in API with implicit memory. Letta: explicit OS-like memory hierarchy. The right choice depends on whether you want to control memory state or just have it work.

Featured in these stacks

The L3 execution stacks that pick this tool as a recommended component, with the one-line note explaining the role it plays in each.

Stack · L3 · Workstation tier · Role: Persistent memory (codebase context across sessions)
Build a local coding-agent stack (May 2026)
Mem0 over Letta or Zep for this stack: dropping a memory layer into OpenHands takes 20 lines of config; Letta's OS-style explicit memory management is overkill for a single-user coding agent; Zep's temporal knowledge graph is strong but slower to wire.
Stack · L3 · Workstation tier · Role: Episodic + semantic memory layer
Build a memory-enabled local agent stack (May 2026)
Mem0 over Letta for the default memory pick: drop-in API, less ceremony, faster to wire. Letta wins when you need OS-style explicit memory management (paging in/out memory blocks for long-horizon tasks) — promote to Letta only when you've outgrown Mem0's memory model.
Stack · L3 · Workstation tier · Role: Memory (local-only via LanceDB)
Build a fully offline coding stack (May 2026)
Mem0 with LanceDB backend — no hosted memory service in the loop. All consolidation runs on the local LLM (vLLM endpoint); no third-party API calls. Cross-session memory works fully offline.

Pros

Drop-in API — minutes to integrate
Graph memory (Mem0g) leads 2026 benchmarks
Conflict detection on contradictory facts
Works with local LLMs
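The conflict detection listed above can be pictured with a small directed labeled graph. The sketch below is a conceptual illustration only and is not Mem0g's implementation: TripleGraph is a hypothetical class, facts are stored as (subject, relation, object) edges, and a conflict is defined narrowly as a second object arriving for a relation that the caller has declared single-valued.

```python
from collections import defaultdict


class TripleGraph:
    """Directed labeled graph: each edge is subject -[relation]-> object."""

    def __init__(self, single_valued_relations):
        # Relations that may point at only one object per subject,
        # e.g. a person lives in one city at a time.
        self.single_valued = set(single_valued_relations)
        self.edges = defaultdict(set)  # (subject, relation) -> {objects}

    def conflicts(self, subject, relation, obj):
        """Return existing objects that contradict the incoming fact."""
        if relation not in self.single_valued:
            return set()
        existing = self.edges.get((subject, relation), set())
        return {other for other in existing if other != obj}

    def add(self, subject, relation, obj, replace=True):
        """Insert a fact; optionally drop the facts it contradicts."""
        clash = self.conflicts(subject, relation, obj)
        if clash and replace:
            self.edges[(subject, relation)] -= clash
        self.edges[(subject, relation)].add(obj)
        return clash
```

With g = TripleGraph({"lives_in"}), adding ("alice", "lives_in", "New York") and then ("alice", "lives_in", "Boston") returns {"New York"} from the second call and leaves only Boston on that edge, while multi-valued relations such as "likes" accumulate objects freely.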

Cons

Cloud tier required for production scale
Graph extraction is LLM-cost heavy
Less control than Letta's explicit OS-style approach

Compatibility

Operating systems	macOS, Linux, Windows
GPU backends	n/a
License	Open source · free (OSS) + managed cloud tiers

Runtime health

Operator-grade signals on how actively Mem0 (agent memory API) is being maintained, how fresh its measurements are, and what failure classes operators have flagged. Every label below is anchored to a real date or count — we never infer maintainer activity we can't show.