Ecosystem map · Updated May 6, 2026

The memory framework ecosystem

Six zones covering every layer of agent memory in 2026 — vector stores, graph memory, agent frameworks, MCP memory servers, local RAG frontends, and observability tooling. Read /systems/agent-memory first if you need the protocol-engineering depth before scanning the landscape.

By Fredoline Eruo · Reviewed monthly

Vector databases

The storage layer for episodic memory and RAG. LanceDB owns the embedded-first tier; Qdrant is the production single-node default; Milvus targets 100M+ vector scale; Weaviate has hybrid search and a richer query language; Redis Vector pairs naturally with stacks that already run Redis as a cache. Chroma is the dev-experience leader for prototypes.

server · OSS

LanceDB

★ 12k

Embedded vector + columnar database. The Lance file format is read directly from S3 or local disk, with no separate server process to run. The pick for embedded apps and notebook workflows.

server · OSS

Qdrant

★ 24k

Vector database written in Rust. Strong filtering (payload-based pre-filter), HNSW index with quantization variants, gRPC + REST APIs. The performance pick when you cross 10M vectors.
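To make "payload-based pre-filter" concrete, here is a minimal sketch in plain Python. It is not Qdrant's API or its index: a brute-force cosine scan stands in for HNSW, and the names `filtered_search`, `points`, and `must` are hypothetical. The only point it illustrates is the ordering, where payload conditions narrow the candidate set before any similarity scoring happens.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filtered_search(points, query, must, top_k=3):
    """Return (id, score) pairs for the best matches that pass the payload filter.

    ASSUMPTION: each point is a dict with 'id', 'vector', and 'payload' keys;
    this shape is illustrative, not a real client schema.
    """
    # Pre-filter: keep only points whose payload satisfies every condition.
    candidates = [
        p for p in points
        if all(p["payload"].get(key) == value for key, value in must.items())
    ]
    # Score only the surviving candidates (brute force stands in for HNSW).
    scored = sorted(candidates, key=lambda p: cosine(p["vector"], query), reverse=True)
    return [(p["id"], cosine(p["vector"], query)) for p in scored[:top_k]]

points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"project": "alpha"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"project": "beta"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"project": "alpha"}},
]
hits = filtered_search(points, [1.0, 0.0], {"project": "alpha"}, top_k=2)
hit_ids = [point_id for point_id, _ in hits]
```

Point 2 is close to the query but never gets scored, because its payload fails the filter first.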

server · OSS

Chroma

★ 17k

Open-source embedding database for LLM applications. The default 'just pip install and start' vector store for prototypes, with first-party clients in Python and JS. SQLite-backed locally, distributed

server · OSS

Weaviate

★ 13k

Vector database with built-in modules for embedding, generative search, and reranking. Schema-first design appeals to teams used to traditional databases. Generative-search module pairs with local Oll

server · OSS

Milvus

★ 30k

Distributed vector database designed for billion-scale workloads. Compute-storage separation, GPU-accelerated index builds, multi-tenant from the ground up. The pick when you've outgrown Qdrant single

server · OSS

Redis (vector search)

★ 65k

Vector search inside the same Redis you already run. HNSW + flat indices, hybrid filtering with FT.SEARCH. The pragmatic pick when you don't want to add another service to ops.

Graph memory + GraphRAG

Where multi-hop reasoning lives. Neo4j's GraphRAG patterns are upstream of how agents structure temporal knowledge graphs. Graphiti is the OSS agent-memory layer that uses Neo4j; Zep is the hosted product with the strongest temporal-graph API. Pick by hosted vs local and by how much memory complexity you want to manage yourself.

server · OSS

Neo4j GraphRAG

★ 2k

Neo4j's official GraphRAG toolkit — Python library + reference patterns for building retrieval-augmented generation against a knowledge graph. The mature pick for enterprises already running Neo4j.

server · OSS

Graphiti (Zep)

★ 6k

Temporal graph memory framework. Builds a bi-temporal knowledge graph from agent conversations, tracking when each fact was learned and when it was true. Powers Zep's hosted offering.
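The bi-temporal idea (when a fact was true versus when it was learned) can be sketched with a plain record type. This is an illustration only, not Graphiti's schema or API: the `Fact` fields, the `facts_as_of` helper, and the example data are all hypothetical. It also simplifies by storing one row per fact, where a fuller model would timestamp the invalidation itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime          # when the fact became true in the world
    valid_to: Optional[datetime]  # when it stopped being true (None = still true)
    learned_at: datetime          # when the agent recorded it

def facts_as_of(facts: List[Fact], valid_at: datetime, known_at: datetime) -> List[Fact]:
    """Facts that were true at valid_at, using only what was recorded by known_at."""
    return [
        f for f in facts
        if f.learned_at <= known_at
        and f.valid_from <= valid_at
        and (f.valid_to is None or valid_at < f.valid_to)
    ]

# Hypothetical conversation history: the user moved cities mid-2025.
facts = [
    Fact("user", "lives_in", "Lagos",
         datetime(2024, 1, 1), datetime(2025, 6, 1), datetime(2025, 1, 10)),
    Fact("user", "lives_in", "Berlin",
         datetime(2025, 6, 1), None, datetime(2025, 7, 15)),
]

# Where did the user live in March 2025?
past = facts_as_of(facts, datetime(2025, 3, 1), datetime(2025, 8, 1))
# Where does the user live in July 2025, given everything known by August?
current = facts_as_of(facts, datetime(2025, 7, 1), datetime(2025, 8, 1))
```

Keeping the two time axes separate is what lets an agent answer "what did I believe then?" as well as "what was actually true then?".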

server · OSS

Zep (memory platform)

★ 4k

Long-term memory platform for AI agents. Sits above Graphiti as the application layer — sessions, facts, summaries, vector + graph hybrid retrieval. The 'memory backend you don't have to build' choice

Agent memory frameworks

The cross-session memory layer for agents. Mem0 is the drop-in API with implicit consolidation — fastest path from zero to working memory. Letta is OS-style explicit memory management — the agent itself reasons about what to remember and what to evict. Different abstractions for the same need; pick by control vs ergonomics.
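The OS-style model can be pictured as a bounded main context backed by an archive. The sketch below is a toy, not Letta's API: `PagedMemory` and its methods are hypothetical, and a fixed first-in-first-out eviction rule stands in for the agent's own reasoning about what to evict.

```python
class PagedMemory:
    """Toy two-tier memory: a bounded main context plus an unbounded archive."""

    def __init__(self, context_limit):
        self.context_limit = context_limit
        self.main_context = []  # plays the role of RAM
        self.archive = []       # plays the role of disk

    def remember(self, item):
        """Add an item to main context, paging out the oldest items on overflow."""
        self.main_context.append(item)
        # ASSUMPTION: FIFO eviction replaces the agent-driven paging decision.
        while len(self.main_context) > self.context_limit:
            self.archive.append(self.main_context.pop(0))

    def search_archive(self, keyword):
        """Case-insensitive substring lookup over paged-out items."""
        return [m for m in self.archive if keyword.lower() in m.lower()]

memory = PagedMemory(context_limit=2)
for note in ["user prefers Rust", "deploy target is ARM", "tests run on CI"]:
    memory.remember(note)

paged_out = memory.search_archive("rust")
```

In an implicit-consolidation design the caller never sees this split; in the explicit design the split is the interface, which is the control-versus-ergonomics trade described above.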

agent · OSS

Mem0 (agent memory API)

★ 28k

Drop-in memory layer for LLM agents. Vector + graph memory variants (Mem0g) — the graph variant builds a directed labeled knowledge graph alongside the vector store, with conflict detection on contrad

agent · OSS

Letta (memory framework)

★ 18k

Agent memory framework that models memory like an operating system. Main context = RAM, archival storage = disk; the agent itself decides when to page. Originally MemGPT, now Letta. Model-agnostic (An

MCP memory servers

Memory exposed via the Model Context Protocol. mcp-server-memory is Anthropic's reference JSON-on-disk knowledge graph: entry-tier, and well suited to personal Claude Desktop setups. mcp-server-postgres exposes structured-knowledge memory for exact lookup (with the SQL-injection caveat surfaced in /systems/mcp). Together, these are the path most claude-code workflows take.
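A JSON-on-disk knowledge graph is simple enough to sketch in a few lines. The class below is a hypothetical illustration of the idea, not the reference server's file format, tool names, or MCP wiring: it stores observations per entity plus a list of relations, and rewrites one JSON file on every change so that a later session can reload it.

```python
import json
import os
import tempfile

class JsonGraphMemory:
    """Toy persistent knowledge graph backed by a single JSON file."""

    def __init__(self, path):
        self.path = path
        self.graph = {"entities": {}, "relations": []}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.graph = json.load(f)

    def _save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.graph, f, indent=2)

    def add_observation(self, entity, observation):
        self.graph["entities"].setdefault(entity, []).append(observation)
        self._save()

    def add_relation(self, source, relation, target):
        self.graph["relations"].append([source, relation, target])
        self._save()

    def observations(self, entity):
        return self.graph["entities"].get(entity, [])

# Session one writes; session two reloads from the same (temporary) file.
path = os.path.join(tempfile.mkdtemp(), "memory.json")
first_session = JsonGraphMemory(path)
first_session.add_observation("project-x", "uses PostgreSQL 16")
first_session.add_relation("alice", "maintains", "project-x")

second_session = JsonGraphMemory(path)
recalled = second_session.observations("project-x")
```

Rewriting the whole file on each change is fine at personal scale and is part of why this design has a single-user ceiling.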

server · OSS

MCP Memory Server

★ 60k

Reference MCP server that gives an agent a persistent knowledge graph — entities, relations, observations stored to disk and surfaced back across sessions. The simplest path to making an agent remembe

server · OSS

MCP PostgreSQL Server

★ 60k

Reference MCP server that exposes a Postgres database as a query surface. Read-only by default — but worth flagging that early versions had a SQL-injection class issue where the read-only wrapper coul

Local RAG frontends

Where memory meets the user. AnythingLLM is the workspace-isolated RAG-first frontend; Open WebUI is the chat-first frontend with RAG bolted on. Both can drive any vector DB; both work with any OAI-compatible LLM. AnythingLLM wins for document-heavy workflows; Open WebUI wins for chat-heavy workflows that occasionally need RAG.

gui · OSS

AnythingLLM

★ 32k

Document-oriented LLM frontend with workspaces. Connects to Ollama, LM Studio, OpenAI, Anthropic, etc. Strong document RAG.

gui · OSS

Open WebUI

★ 80k

Self-hosted ChatGPT-style web frontend. Pairs with Ollama or any OpenAI-compatible backend. Multi-user, RAG built in, fast.

Observability and evaluation

Memory without auditing becomes confidently wrong. LangSmith is the LangChain-native trace + eval platform; Phoenix (Arize) is the OSS OpenInference-native alternative. Both surface memory drift, retrieval quality, and consolidation hallucinations — but only if you wire them in.

orchestrator

LangSmith

LangChain's observability + evaluation platform. Trace agent runs, run evaluators against benchmark suites, version prompts. The dominant trace+eval tool for the LangChain/LangGraph ecosystem.

orchestrator · OSS

Phoenix (Arize AI)

★ 7k

Open-source LLM tracing + evaluation. OpenInference standard for traces; runs locally with one pip install. The OSS-first pick for teams that want LangSmith-shaped functionality without vendor lock-in

Category leaders

The tools that compounded their ecosystem leads through the 2025-2026 cycle:

Mem0: drop-in agent memory. Default for most new memory-enabled deployments thanks to ergonomics and fast wiring.
Letta: OS-style explicit memory. v0.7 (April 2026) made this genuinely usable. The pick when deterministic memory state matters.
LanceDB: embedded-first vector storage. The default for offline / single-process deployments; scales further than Chroma before needing a server.
Qdrant: production single-node vector DB. Best ops surface in the category; product quantization (PQ) makes it the right pick for storage-constrained deployments.
Graphiti: OSS graph memory, now production-ready (1.0 release). The local-first alternative to Zep.
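The storage argument behind the Qdrant entry can be checked with back-of-envelope arithmetic. The sketch below assumes generic product quantization, where each vector is split into subvectors and each subvector is replaced by a one-byte code. The corpus size, dimension, and subvector count are illustrative assumptions, not Qdrant defaults, and the figures exclude codebooks, the HNSW graph, and any original vectors kept for rescoring.

```python
def raw_bytes(num_vectors, dim, bytes_per_float=4):
    """Storage for uncompressed float32 vectors."""
    return num_vectors * dim * bytes_per_float

def pq_bytes(num_vectors, num_subvectors, bits_per_code=8):
    """Storage for PQ codes: one code per subvector per vector."""
    return num_vectors * num_subvectors * bits_per_code // 8

# ASSUMPTION: 10M vectors, 768 dimensions, 96 subvectors of 8 dimensions each.
n, dim, m = 10_000_000, 768, 96

raw = raw_bytes(n, dim)      # 30,720,000,000 bytes (about 30.7 GB)
compressed = pq_bytes(n, m)  # 960,000,000 bytes (about 0.96 GB)
ratio = raw / compressed     # 32.0
```

Under these assumptions a corpus that needs roughly 30 GB uncompressed fits in about 1 GB of codes, at the cost of approximate distances.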

Declining tools

Tools whose ecosystem position has softened through the cycle:

Pinecone for new local-AI deployments. Cloud-only and expensive at scale; teams default to local (LanceDB / Qdrant) or hosted-OSS (Qdrant Cloud).
Bare LangChain memory primitives. They still work but the dedicated memory frameworks (Mem0, Letta, Zep, Graphiti) ship better abstractions; LangChain memory is now mostly a reference implementation.

Best memory stack by use case

Single-user coding agent (private codebase): Mem0 (LanceDB) + MCP-postgres + MCP-git. Local; fast; private. See /stacks/memory-enabled-agent.

Long-horizon planning agent: Letta (+ optional Graphiti for multi-hop). The explicit memory hierarchy beats implicit consolidation when the agent needs to reason about its own memory state.

Team-shared chat with cross-machine continuity: Zep cloud + Open WebUI. Hosted; cross-machine; the right pick for teams comfortable with hosted memory.

Air-gapped RAG over private documents: AnythingLLM + LanceDB + Ollama. No memory framework — just workspace-isolated retrieval. See /stacks/offline-rag-workstation.

Personal Claude Desktop with simple persistence: mcp-server-memory. Trivial setup; the right ceiling for a single-user JSON-on-disk knowledge graph.

Missing catalog entries

Tools we've evaluated but haven't yet added to the catalog. SENTINEL flags them when stacks reference missing slugs:

Mem0g — Mem0's graph variant. Same project, different storage shape. Currently lives under the Mem0 catalog entry as a feature.
Pinecone — intentionally not in catalog. Cloud-only; doesn't fit the local-AI editorial scope.
SQLite-Vec — embedded vector store via SQLite extension. Worth adding for the “vector search without a real vector DB” tier.
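The missing-slug check described above amounts to a set difference between the slugs that stacks or zones reference and the slugs the catalog defines. The following is a sketch of that idea only, not the site's actual SENTINEL implementation (the catalog itself is seeded from a TypeScript file); the function name, data shapes, and slug strings are all hypothetical.

```python
def missing_slugs(catalog_slugs, references):
    """Map each referencing group to the slugs it uses that the catalog lacks.

    ASSUMPTION: `references` maps a stack or zone name to a list of tool slugs.
    Groups with nothing missing are omitted from the result.
    """
    known = set(catalog_slugs)
    report = {}
    for group, slugs in references.items():
        missing = sorted(set(slugs) - known)
        if missing:
            report[group] = missing
    return report

catalog = ["mem0", "letta", "lancedb", "qdrant"]
references = {
    "agent-memory": ["mem0", "letta", "mem0g"],
    "vector-databases": ["lancedb", "qdrant", "sqlite-vec"],
    "local-rag": ["lancedb"],
}
report = missing_slugs(catalog, references)
```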

How this map updates

This page reads its zones live from the catalog. New tools land in scripts/seed/tools.ts and show up here on the next deploy when their slug is referenced in a zone above. Editorial framing — zone titles, blurbs, “what changed this month,” category leaders / declining tools / best-by-use-case — is hand-written and refreshed on the first business day of each month. Inclusion bar: a tool has to be one we've actually used and can write operator notes about.

Going deeper

What agent memory actually is — the architectural depth this map assumes.
/stacks/memory-enabled-agent — the canonical memory-enabled deployment recipe.
Local AI agent ecosystem — where memory frameworks plug into the broader agent map.
MCP ecosystem — protocol layer where MCP-memory + MCP-postgres live.