RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
Ecosystem map · Updated May 6, 2026

The memory framework ecosystem

Six zones covering every layer of agent memory in 2026 — vector stores, graph memory, agent frameworks, MCP memory servers, local RAG frontends, and observability tooling. Read /systems/agent-memory first if you need the protocol-engineering depth before scanning the landscape.

By Fredoline Eruo · Reviewed monthly
What changed this month
  • Mem0g (graph memory variant) took the multi-hop agent-memory benchmark lead at 68.4% LLM Score in April. The flat-vector vs graph-memory architectural debate now has clear empirical evidence on the graph side for multi-hop tasks.
  • Letta v0.7 (April 2026) shipped an explicit memory hierarchy. The OS-style abstraction is now genuinely usable rather than research-quality.
  • Graphiti reached 1.0 with stable Neo4j integration and a polished agent-memory API. The OSS counterpart to Zep is now production-ready for teams that want full local control.
  • The Postgres MCP CVE from earlier in the year is still relevant: pin current versions, run with a least-privilege role, and flag the issue in any production deployment.
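
A least-privilege role for a Postgres-backed MCP server could look roughly like the sketch below. It only generates standard Postgres statements; the role, database, and schema names are hypothetical placeholders, and the password is assumed to be set out of band.

```python
def read_only_role_sql(role: str, database: str, schema: str = "public") -> list[str]:
    """Return Postgres statements that create a SELECT-only login role.

    Every name passed in is a hypothetical placeholder for illustration.
    """
    for name in (role, database, schema):
        # Refuse anything that is not a plain identifier, so the generated
        # SQL cannot itself become an injection vector.
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    return [
        f"CREATE ROLE {role} LOGIN",
        f"GRANT CONNECT ON DATABASE {database} TO {role}",
        f"GRANT USAGE ON SCHEMA {schema} TO {role}",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role}",
        # Sessions for this role start read-only by default.
        f"ALTER ROLE {role} SET default_transaction_read_only = on",
    ]


statements = read_only_role_sql("mcp_reader", "agent_memory")
```

The GRANT statements are the actual security boundary here. The read-only default is only a guard rail, since a session can change that setting, so it should not be relied on alone.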
Zones
  1. Vector databases
  2. Graph memory + GraphRAG
  3. Agent memory frameworks
  4. MCP memory servers
  5. Local RAG frontends
  6. Observability and evaluation

Vector databases

The storage layer for episodic memory and RAG. LanceDB owns the embedded-first tier; Qdrant is the production single-node default; Milvus targets 100M+ vector scale; Weaviate has hybrid search and a richer query language; Redis Vector pairs naturally with stacks that already run Redis as a cache. Chroma is the dev-experience leader for prototypes.

server · OSS

LanceDB

★ 12k

Embedded vector + columnar database. The Lance file format is read directly from S3 or local disk, so there is no separate server process to run. The pick for embedded apps and notebook workflows.

server · OSS

Qdrant

★ 24k

Vector database written in Rust. Strong filtering (payload-based pre-filter), HNSW index with quantization variants, gRPC + REST APIs. The performance pick when you cross 10M vectors.
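
To make "payload-based pre-filter" concrete, here is a minimal brute-force sketch: candidates are narrowed by payload fields first, and only the survivors are ranked by cosine similarity. This is not Qdrant's API and it does not use HNSW; the point layout and the `search` helper are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points, query, must, k=3):
    """Filter by payload first, then rank the remaining points by similarity."""
    candidates = [
        p for p in points
        if all(p["payload"].get(field) == value for field, value in must.items())
    ]
    ranked = sorted(candidates, key=lambda p: cosine(p["vector"], query), reverse=True)
    return [p["id"] for p in ranked[:k]]


# Hypothetical memory points: a vector plus a small metadata payload.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"user": "ana"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"user": "bo"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"user": "ana"}},
]

hits = search(points, query=[1.0, 0.0], must={"user": "ana"}, k=2)
```

Point 2 is close to the query but never gets scored, because the filter removes it before ranking. Filtering before ranking is what keeps one user's memories out of another user's results.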

server · OSS

Chroma

★ 17k

Open-source embedding database for LLM applications. The default 'just pip install and start' vector store for prototypes, with first-party clients in Python and JS. SQLite-backed locally, distributed…

server · OSS

Weaviate

★ 13k

Vector database with built-in modules for embedding, generative search, and reranking. Schema-first design appeals to teams used to traditional databases. Generative-search module pairs with local Oll…

server · OSS

Milvus

★ 30k

Distributed vector database designed for billion-scale workloads. Compute-storage separation, GPU-accelerated index builds, multi-tenant from the ground up. The pick when you've outgrown Qdrant single…

server · OSS

Redis (vector search)

★ 65k

Vector search inside the same Redis you already run. HNSW + flat indices, hybrid filtering with FT.SEARCH. The pragmatic pick when you don't want to add another service to ops.

Graph memory + GraphRAG

Where multi-hop reasoning lives. Neo4j's GraphRAG patterns are upstream of how agents structure temporal knowledge graphs. Graphiti is the OSS agent-memory layer that uses Neo4j; Zep is the hosted product with the strongest temporal-graph API. Pick by hosted vs local and by how much memory complexity you want to manage yourself.
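
As a rough illustration of what "multi-hop" means here, the sketch below walks a tiny set of (subject, relation, object) triples breadth-first and records the chain of facts that connects each reachable entity to the starting one. The triples and the `multi_hop` helper are hypothetical; real graph stores run this kind of traversal through their own query languages.

```python
from collections import deque


def multi_hop(triples, start, max_hops):
    """Return {entity: list of triples linking it to start}, up to max_hops."""
    adjacency = {}
    for subject, relation, obj in triples:
        adjacency.setdefault(subject, []).append((relation, obj))

    paths = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if len(paths[node]) == max_hops:
            continue  # hop budget used up along this path
        for relation, obj in adjacency.get(node, []):
            if obj not in paths:  # first visit is the shortest path in BFS
                paths[obj] = paths[node] + [(node, relation, obj)]
                queue.append(obj)
    return paths


# Hypothetical facts an agent might have stored.
triples = [
    ("alice", "works_on", "billing-service"),
    ("billing-service", "depends_on", "postgres"),
    ("postgres", "hosted_in", "eu-west"),
]

reachable = multi_hop(triples, "alice", max_hops=2)
```

A flat vector lookup for "alice" would likely surface only the first fact. The second hop to `postgres` exists only because the relation is stored as an edge.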

server · OSS

Neo4j GraphRAG

★ 2k

Neo4j's official GraphRAG toolkit — Python library + reference patterns for building retrieval-augmented generation against a knowledge graph. The mature pick for enterprises already running Neo4j.

server · OSS

Graphiti (Zep)

★ 6k

Temporal graph memory framework. Builds a bi-temporal knowledge graph from agent conversations, tracking when each fact was learned and when it was true. Powers Zep's hosted offering.
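
The bi-temporal idea can be sketched with two time axes per fact: when it was true in the world, and when the agent learned it. The `Fact` shape and `as_of` helper below are assumptions for illustration, not Graphiti's data model or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: date           # when the fact became true in the world
    valid_to: Optional[date]   # when it stopped being true (None = open-ended)
    learned_at: date           # when the agent recorded it


def as_of(facts, subject, predicate, valid_on, known_by):
    """What did the agent believe by `known_by` about the state on `valid_on`?"""
    matches = [
        f for f in facts
        if f.subject == subject and f.predicate == predicate
        and f.learned_at <= known_by
        and f.valid_from <= valid_on
        and (f.valid_to is None or valid_on < f.valid_to)
    ]
    # When several records match, the most recently learned one wins.
    return max(matches, key=lambda f: f.learned_at).value if matches else None


# Records are appended, never mutated: learning about the move in June 2025
# adds a closed Lisbon interval and a new Berlin interval.
facts = [
    Fact("user", "lives_in", "Lisbon", date(2024, 1, 1), None, date(2024, 2, 1)),
    Fact("user", "lives_in", "Lisbon", date(2024, 1, 1), date(2025, 3, 1), date(2025, 6, 10)),
    Fact("user", "lives_in", "Berlin", date(2025, 3, 1), None, date(2025, 6, 10)),
]

believed_in_may = as_of(facts, "user", "lives_in", date(2025, 4, 1), known_by=date(2025, 5, 1))
believed_in_july = as_of(facts, "user", "lives_in", date(2025, 4, 1), known_by=date(2025, 7, 1))
```

Both queries ask about the same day, 1 April 2025, but the answer depends on what the agent knew at the time of asking. A store that only overwrites facts cannot reproduce that difference.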

server · OSS

Zep (memory platform)

★ 4k

Long-term memory platform for AI agents. Sits above Graphiti as the application layer — sessions, facts, summaries, vector + graph hybrid retrieval. The 'memory backend you don't have to build' choice

Agent memory frameworks

The cross-session memory layer for agents. Mem0 is the drop-in API with implicit consolidation — fastest path from zero to working memory. Letta is OS-style explicit memory management — the agent itself reasons about what to remember and what to evict. Different abstractions for the same need; pick by control vs ergonomics.
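
A toy version of the OS-style abstraction: a bounded main context that evicts to unbounded archival storage, plus an explicit page-in call. The `PagedMemory` class is hypothetical and is not Letta's API. In particular, eviction here is simply oldest-first, whereas the appeal of the real approach is that the agent reasons about what to keep.

```python
class PagedMemory:
    """Bounded main context with eviction to an unbounded archive."""

    def __init__(self, context_budget: int):
        self.context_budget = context_budget  # max items held in main context
        self.main_context: list[str] = []     # plays the role of RAM
        self.archive: list[str] = []          # plays the role of disk

    def remember(self, item: str) -> None:
        self.main_context.append(item)
        while len(self.main_context) > self.context_budget:
            # Oldest-first eviction is a stand-in for an agent's own choice.
            self.archive.append(self.main_context.pop(0))

    def page_in(self, keyword: str) -> list[str]:
        """Move archived items that mention `keyword` back into main context."""
        hits = [m for m in self.archive if keyword.lower() in m.lower()]
        for hit in hits:
            self.archive.remove(hit)
            self.remember(hit)
        return hits


memory = PagedMemory(context_budget=2)
memory.remember("user prefers Rust")
memory.remember("project uses Postgres")
memory.remember("deadline is Friday")   # evicts the Rust preference
paged = memory.page_in("rust")          # brings it back, evicting Postgres
```

Because every move between the two tiers is an explicit call, the memory state at any point is inspectable. That is the control-over-ergonomics trade the zone description refers to.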

agent · OSS

Mem0 (agent memory API)

★ 28k

Drop-in memory layer for LLM agents. Vector + graph memory variants (Mem0g) — the graph variant builds a directed labeled knowledge graph alongside the vector store, with conflict detection on contrad…

agent · OSS

Letta (memory framework)

★ 18k

Agent memory framework that models memory like an operating system. Main context = RAM, archival storage = disk; the agent itself decides when to page. Originally MemGPT, now Letta. Model-agnostic (An…

MCP memory servers

Memory exposed via the Model Context Protocol. mcp-server-memory is Anthropic's reference JSON-on-disk knowledge graph: entry-tier, and a good fit for personal Claude Desktop setups. mcp-server-postgres exposes structured-knowledge memory for exact lookup (with the SQL-injection caveat surfaced in /systems/mcp). This is the path most Claude Code workflows take.
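
The JSON-on-disk knowledge graph idea is small enough to sketch directly: entities carry observations, relations are edges, and everything is rewritten to one file so a later session can reload it. The file layout and the `JsonKnowledgeGraph` class are assumptions for illustration; they are not the reference server's actual storage format or tool interface.

```python
import json
import tempfile
from pathlib import Path


class JsonKnowledgeGraph:
    """Entities with observations plus relations, persisted as one JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {"entities": {}, "relations": []}

    def observe(self, entity, observation):
        self.data["entities"].setdefault(entity, []).append(observation)
        self._save()

    def relate(self, source, relation, target):
        edge = [source, relation, target]  # lists survive a JSON round trip
        if edge not in self.data["relations"]:
            self.data["relations"].append(edge)
        self._save()

    def _save(self):
        self.path.write_text(json.dumps(self.data, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "memory.json"  # hypothetical file name

    first_session = JsonKnowledgeGraph(store)
    first_session.observe("alice", "prefers dark mode")
    first_session.relate("alice", "works_on", "billing-service")

    # A fresh object stands in for a new session reading the same file.
    second_session = JsonKnowledgeGraph(store)
    recalled = second_session.data["entities"]["alice"]
    edges = second_session.data["relations"]
```

Rewriting the whole file on every change is fine for a single user and shows why this tier has a ceiling: there is no locking, indexing, or concurrent access.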

server · OSS

MCP Memory Server

★ 60k

Reference MCP server that gives an agent a persistent knowledge graph — entities, relations, observations stored to disk and surfaced back across sessions. The simplest path to making an agent remembe…

server · OSS

MCP PostgreSQL Server

★ 60k

Reference MCP server that exposes a Postgres database as a query surface. Read-only by default — but worth flagging that early versions had a SQL-injection class issue where the read-only wrapper coul…

Local RAG frontends

Where memory meets the user. AnythingLLM is the workspace-isolated RAG-first frontend; Open WebUI is the chat-first frontend with RAG bolted on. Both can drive any vector DB; both work with any OAI-compatible LLM. AnythingLLM wins for document-heavy workflows; Open WebUI wins for chat-heavy workflows that occasionally need RAG.

gui · OSS

AnythingLLM

★ 32k

Document-oriented LLM frontend with workspaces. Connects to Ollama, LM Studio, OpenAI, Anthropic, etc. Strong document RAG.

gui · OSS

Open WebUI

★ 80k

Self-hosted ChatGPT-style web frontend. Pairs with Ollama or any OpenAI-compatible backend. Multi-user, RAG built in, fast.

Observability and evaluation

Memory without auditing becomes confidently wrong. LangSmith is the LangChain-native trace + eval platform; Phoenix (Arize) is the OSS OpenInference-native alternative. Both surface memory drift, retrieval quality, and consolidation hallucinations — but only if you wire them in.
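
One simple way to put a number on "retrieval quality" is recall@k over logged retrievals: of the memories that should have come back, what fraction appeared in the top k? The trace shape below is a hypothetical stand-in, not the LangSmith or Phoenix data model.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant memory IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        raise ValueError("recall is undefined without any relevant items")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(set(relevant))


def mean_recall_at_k(traces, k):
    """Average recall@k across a list of logged retrieval traces."""
    return sum(recall_at_k(t["retrieved"], t["relevant"], k) for t in traces) / len(traces)


# Hypothetical traces: what the memory layer returned vs. what a reviewer
# marked as the memories that should have been returned.
traces = [
    {"retrieved": ["m1", "m7", "m3"], "relevant": ["m1", "m3"]},
    {"retrieved": ["m9", "m2", "m4"], "relevant": ["m4", "m5"]},
]

recall_2 = mean_recall_at_k(traces, k=2)
recall_3 = mean_recall_at_k(traces, k=3)
```

Tracking a metric like this over time is one way memory drift becomes visible: the stored memories are still there, but the right ones stop appearing near the top.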

orchestrator

LangSmith

LangChain's observability + evaluation platform. Trace agent runs, run evaluators against benchmark suites, version prompts. The dominant trace+eval tool for the LangChain/LangGraph ecosystem.

orchestrator · OSS

Phoenix (Arize AI)

★ 7k

Open-source LLM tracing + evaluation. OpenInference standard for traces; runs locally with one pip install. The OSS-first pick for teams that want LangSmith-shaped functionality without vendor lock-in

Category leaders

The tools that compounded their ecosystem leads through the 2025-2026 cycle:

  • Mem0: drop-in agent memory. Default for most new memory-enabled deployments thanks to ergonomics and fast wiring.
  • Letta: OS-style explicit memory. v0.7 (April 2026) made this genuinely usable. The pick when deterministic memory state matters.
  • LanceDB: embedded-first vector storage. The default for offline / single-process deployments; scales further than Chroma before needing a server.
  • Qdrant: production single-node vector DB. Best ops surface in the category; product quantization (PQ) makes it the right pick for storage-constrained deployments.
  • Graphiti: OSS graph memory, now production-ready (1.0 release). The local-first alternative to Zep.
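
To see why product quantization matters for storage-constrained deployments, a back-of-envelope calculation helps. The collection size, dimension, and PQ settings below are illustrative assumptions, not Qdrant defaults, and the estimate ignores codebooks, index structures, and any retained original vectors.

```python
def raw_bytes(num_vectors: int, dim: int, bytes_per_float: int = 4) -> int:
    """Storage for uncompressed float32 vectors."""
    return num_vectors * dim * bytes_per_float


def pq_bytes(num_vectors: int, num_subvectors: int, bits_per_code: int = 8) -> int:
    """Storage for PQ codes: one small code per subvector per vector."""
    return num_vectors * num_subvectors * bits_per_code // 8


# Assumed workload: 10M vectors of 768 dimensions, split into 96 subvectors
# of 8 dimensions each, with one byte (256 centroids) per subvector.
uncompressed = raw_bytes(10_000_000, 768)
compressed = pq_bytes(10_000_000, 96)
ratio = uncompressed // compressed
```

Under these assumptions the vectors shrink from about 30.7 GB to under 1 GB, a 32x reduction, at the cost of approximate distances that usually call for rescoring against the original vectors.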

Declining tools

Tools whose ecosystem position has softened through the cycle:

  • Pinecone for new local-AI deployments. Cloud-only and expensive at scale; teams default to local (LanceDB / Qdrant) or hosted-OSS (Qdrant Cloud).
  • Bare LangChain memory primitives. They still work but the dedicated memory frameworks (Mem0, Letta, Zep, Graphiti) ship better abstractions; LangChain memory is now mostly a reference implementation.

Best memory stack by use case

Single-user coding agent (private codebase): Mem0 (LanceDB) + MCP-postgres + MCP-git. Local; fast; private. See /stacks/memory-enabled-agent.
Long-horizon planning agent: Letta + (optional Graphiti for multi-hop). The explicit memory hierarchy beats implicit consolidation when the agent needs to reason about its own memory state.
Team-shared chat with cross-machine continuity: Zep cloud + Open WebUI. Hosted; cross-machine; the right pick for teams comfortable with hosted memory.
Air-gapped RAG over private documents: AnythingLLM + LanceDB + Ollama. No memory framework — just workspace-isolated retrieval. See /stacks/offline-rag-workstation.
Personal Claude Desktop with simple persistence: mcp-server-memory. Trivial setup; the right ceiling for a single-user, JSON-on-disk knowledge graph.

Missing catalog entries

Tools we've evaluated but haven't yet added to the catalog. SENTINEL flags them when stacks reference missing slugs:

  • Mem0g — Mem0's graph variant. Same project, different storage shape. Currently lives under the Mem0 catalog entry as a feature.
  • Pinecone — intentionally not in catalog. Cloud-only; doesn't fit the local-AI editorial scope.
  • SQLite-Vec — embedded vector store via SQLite extension. Worth adding for the “vector search without a real vector DB” tier.
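
The missing-slug check described above amounts to a set difference between the slugs stacks reference and the slugs the catalog defines. How SENTINEL actually does this is not described here, so the function, slugs, and stack names below are purely hypothetical.

```python
def missing_slugs(catalog_slugs, stacks):
    """Return {stack name: sorted slugs it references that the catalog lacks}."""
    known = set(catalog_slugs)
    report = {}
    for stack_name, referenced in stacks.items():
        absent = sorted(set(referenced) - known)
        if absent:
            report[stack_name] = absent
    return report


# Hypothetical catalog and stack definitions.
catalog = {"mem0", "lancedb", "qdrant"}
stacks = {
    "memory-enabled-agent": ["mem0", "lancedb", "sqlite-vec"],
    "offline-rag-workstation": ["lancedb"],
}

flagged = missing_slugs(catalog, stacks)
```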

How this map updates

This page reads its zones live from the catalog. New tools land in scripts/seed/tools.ts and show up here on the next deploy when their slug is referenced in a zone above. Editorial framing — zone titles, blurbs, “what changed this month,” category leaders / declining tools / best-by-use-case — is hand-written and refreshed on the first business day of each month. Inclusion bar: a tool has to be one we've actually used and can write operator notes about.

Going deeper

  • What agent memory actually is — the architectural depth this map assumes.
  • /stacks/memory-enabled-agent — the canonical memory-enabled deployment recipe.
  • Local AI agent ecosystem — where memory frameworks plug into the broader agent map.
  • MCP ecosystem — protocol layer where MCP-memory + MCP-postgres live.