What this does

A vector store acts as the agent's external memory by storing past interactions, facts, and documents as embeddings. The agent retrieves semantically relevant memories when processing new inputs.
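
To make "semantically relevant" concrete, here is a minimal sketch of what the store does at query time: it ranks stored vectors by cosine similarity to the query vector. The three-dimensional vectors and the `top_k` helper are hand-made for illustration; a real embedding model produces much longer vectors, and the store uses an index rather than a full sort.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings", invented for this example.
toy_memories = {
    "User prefers dark mode": [0.9, 0.1, 0.0],
    "User lives in Berlin": [0.0, 0.2, 0.9],
    "User dislikes bright themes": [0.8, 0.3, 0.1],
}

def top_k(query_vector, memories, k=2):
    """Return the k memory texts whose vectors are closest to the query."""
    ranked = sorted(
        memories.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Stands in for the embedding of a question about UI themes.
query = [1.0, 0.2, 0.0]
result = top_k(query, toy_memories)
print(result)
```

Both theme-related memories rank above the unrelated location fact, even though they share few words with each other.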

Steps

Initialize the vector store with an embedding function. OllamaEmbeddings needs a running Ollama server with the nomic-embed-text model already pulled.

from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text")
memory_store = Chroma(
    embedding_function=embeddings,
    collection_name="agent_memory",
    persist_directory="./memory_store"
)

Store memories as documents with metadata. Include a timestamp, a type tag, and an importance score so memories can be filtered and pruned later.

from langchain_core.documents import Document
from datetime import datetime

def save_memory(content: str, memory_type: str = "fact", importance: float = 0.5):
    doc = Document(
        page_content=content,
        metadata={
            "type": memory_type,
            "timestamp": datetime.now().isoformat(),
            "importance": importance
        }
    )
    memory_store.add_documents([doc])

Retrieve memories relevant to the current input.

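from typing import Optional

def retrieve_memories(query: str, k: int = 5, memory_type: Optional[str] = None) -> list[str]:
    # Restrict the search to one memory type when requested; None searches everything.
    metadata_filter = {"type": memory_type} if memory_type else None
    docs = memory_store.similarity_search(query, k=k, filter=metadata_filter)
    return [d.page_content for d in docs]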
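
When plain similarity returns marginal matches, one option is to fetch more candidates than needed and rerank them using the stored importance score. The sketch below works on plain tuples so it runs without a vector store; `rerank` and `importance_weight` are hypothetical names. It assumes similarity scores normalized to [0, 1], which LangChain vector stores typically provide through `similarity_search_with_relevance_scores` (verify against your installed version).

```python
def rerank(candidates, k=3, importance_weight=0.3):
    """Blend similarity and importance, then keep the top k texts.

    candidates: list of (text, similarity, importance), scores in [0, 1].
    """
    def combined(candidate):
        _, similarity, importance = candidate
        return (1 - importance_weight) * similarity + importance_weight * importance

    ranked = sorted(candidates, key=combined, reverse=True)
    return [text for text, _, _ in ranked[:k]]

# Invented candidates standing in for an over-fetched search result.
candidates = [
    ("User mentioned the weather once", 0.80, 0.1),
    ("User is allergic to peanuts", 0.75, 0.9),
    ("User likes jazz", 0.40, 0.5),
]
reranked = rerank(candidates, k=2)
print(reranked)
```

The allergy fact overtakes the slightly more similar weather remark because its importance is much higher. Setting `importance_weight` to 0 reduces this to plain similarity ordering.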

Create a memory retrieval tool for the agent. Let the agent query its own memory.

from langchain.tools import tool

@tool
def query_memory(query: str) -> str:
    """Search the agent's long-term memory for relevant information."""
    memories = retrieve_memories(query, k=3)
    if not memories:
        return "No relevant memories found."
    return "\n".join(f"- {m}" for m in memories)

@tool
def save_to_memory(fact: str) -> str:
    """Save an important fact to long-term memory."""
    save_memory(fact, memory_type="fact")
    return f"Saved: {fact}"

Prune old, low-importance memories to prevent unbounded growth. A memory is deleted only when it is both older than max_age_days and below min_importance.

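def prune_memories(max_age_days: int = 30, min_importance: float = 0.3) -> int:
    all_docs = memory_store.get()  # dict with parallel "ids" and "metadatas" lists
    now = datetime.now()
    to_delete = []
    for doc_id, meta in zip(all_docs["ids"], all_docs["metadatas"]):
        # Skip entries that were stored without the expected metadata.
        if not meta or "timestamp" not in meta:
            continue
        age = now - datetime.fromisoformat(meta["timestamp"])
        # Delete only memories that are both old and unimportant.
        # A missing importance value is treated as 0.0 (lowest).
        if age.days > max_age_days and meta.get("importance", 0.0) < min_importance:
            to_delete.append(doc_id)
    if to_delete:
        memory_store.delete(ids=to_delete)
    return len(to_delete)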
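
The pruning rule is easier to check when the decision is separated from the store. This is a small sketch with a hypothetical `should_prune` helper and invented sample metadata; it mirrors the age and importance condition used above and takes `now` as a parameter so the result is reproducible.

```python
from datetime import datetime, timedelta

def should_prune(meta, now, max_age_days=30, min_importance=0.3):
    """True only when the memory is both old and low in importance."""
    age = now - datetime.fromisoformat(meta["timestamp"])
    return age.days > max_age_days and meta["importance"] < min_importance

now = datetime(2024, 6, 1)  # fixed reference time for the example
samples = {
    "old_trivial": {"timestamp": (now - timedelta(days=45)).isoformat(), "importance": 0.1},
    "old_important": {"timestamp": (now - timedelta(days=45)).isoformat(), "importance": 0.9},
    "new_trivial": {"timestamp": (now - timedelta(days=2)).isoformat(), "importance": 0.1},
}
pruned = [name for name, meta in samples.items() if should_prune(meta, now)]
print(pruned)
```

Only the entry that fails both tests is selected. Old but important memories and recent but trivial ones are kept.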

Inject memories into the agent prompt. Add retrieved memories as context.

def build_prompt_with_memory(user_input: str) -> str:
    memories = retrieve_memories(user_input)
    memory_block = "\n".join(f"[Memory] {m}" for m in memories)
    return f"{memory_block}\n\nUser: {user_input}\nAssistant:"
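
Two edge cases are worth handling when injecting memories: an empty result, which would otherwise leave blank lines at the top of the prompt, and a long result, which can crowd out the user's input. The variant below is a sketch that takes the memories as an argument so it runs standalone; `format_memory_block`, `build_prompt`, and the `max_chars` budget are hypothetical names, and a character count is only a rough stand-in for a token limit.

```python
def format_memory_block(memories, max_chars=500):
    """Join memories into tagged lines, stopping before the budget is exceeded."""
    lines = []
    used = 0
    for memory in memories:
        line = f"[Memory] {memory}"
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)

def build_prompt(user_input, memories, max_chars=500):
    """Build the prompt, omitting the memory block entirely when it is empty."""
    block = format_memory_block(memories, max_chars)
    prefix = f"{block}\n\n" if block else ""
    return f"{prefix}User: {user_input}\nAssistant:"

print(build_prompt("What theme do I like?", ["User prefers dark mode"]))
print(build_prompt("Hello", []))
```

Because similarity search returns the closest matches first, truncating from the end drops the least relevant memories.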

Verification

python -c "
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
from langchain_core.documents import Document
emb = OllamaEmbeddings(model='nomic-embed-text')
vs = Chroma(embedding_function=emb, collection_name='test_mem')
vs.add_documents([Document(page_content='test fact')])
r = vs.similarity_search('test', k=1)
print(len(r))
# Expected: 1
"

Common failures

Memory persistence lost without persist_directory. ChromaDB in-memory stores vanish when the process restarts. Always set persist_directory.
Retrieved memories are irrelevant. The embedding model may not capture the semantic relationship. Try a different embedding model or increase k and rerank.
Duplicate memories accumulate. The same fact stored multiple times wastes space. Check for semantic similarity before storing.
Version mismatch. The installed package or runtime differs from the one this guide assumes. Check the installed versions first, then rerun the verification command.
Local environment drift. A different service, virtual environment, model, or path is in use than expected. Print the active binary path and configuration before changing the guide steps.
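
For the duplicate-memory failure, one approach is to search for the nearest existing memory before saving and skip the write when it is close enough. The sketch below keeps the store behind two callables so it runs without Chroma; `save_if_new`, `fake_search`, and the 0.9 threshold are hypothetical. The word-overlap score stands in for embedding similarity only to make the example runnable; with a real store, pass a function that returns (text, relevance score) pairs.

```python
def save_if_new(content, search_with_scores, save, threshold=0.9):
    """Store content unless a near-identical memory already exists.

    search_with_scores(content, k) -> list of (text, similarity in [0, 1])
    save(content) -> None
    Returns True when the content was stored.
    """
    matches = search_with_scores(content, 1)
    if matches and matches[0][1] >= threshold:
        return False
    save(content)
    return True

stored = []  # stand-in for the vector store

def fake_search(query, k):
    """Toy similarity: Jaccard overlap of lowercase word sets, not an embedding model."""
    query_words = set(query.lower().split())
    scored = []
    for text in stored:
        text_words = set(text.lower().split())
        union = query_words | text_words
        score = len(query_words & text_words) / len(union) if union else 0.0
        scored.append((text, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

results = [
    save_if_new("User prefers dark mode", fake_search, stored.append),
    save_if_new("user prefers dark mode", fake_search, stored.append),
    save_if_new("User lives in Berlin", fake_search, stored.append),
]
print(results, stored)
```

The threshold is a tuning knob: too low and distinct facts get dropped, too high and paraphrases slip through.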

Related guides

How to Implement Agent Memory (Short and Long Term)
How to Apply Metadata Filters to Reduce Search Space