HOW-TO · RAG

How to Use a Vector Store as Agent Memory

Intermediate · 20 min · By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES

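A running vector store (ChromaDB or Qdrant; the steps below use ChromaDB), an agent framework (the steps below use LangChain), Python 3.10+, and the nomic-embed-text embedding model available in Ollama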

What this does

A vector store acts as the agent's external memory by storing past interactions, facts, and documents as embeddings. The agent retrieves semantically relevant memories when processing new inputs.
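To make the retrieval idea concrete, here is a minimal, dependency-free sketch of similarity-based recall. It is not how ChromaDB or an embedding model works internally: `toy_embed` is a hypothetical stand-in that uses word counts instead of learned embeddings, so it only captures word overlap, whereas a real model also captures meaning.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, memories: list[str], k: int = 2) -> list[str]:
    """Return the k stored memories most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, toy_embed(m)), reverse=True)
    return ranked[:k]


memories = [
    "the user prefers dark mode",
    "the deploy script lives in tools/deploy.sh",
    "the user timezone is UTC+1",
]
print(top_k("which mode does the user prefer", memories, k=1))
```

The steps below follow the same shape: embed the stored text, embed the query, and return the nearest entries.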

Steps

  • Initialize the vector store with an embedding function.
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text")
memory_store = Chroma(
    embedding_function=embeddings,
    collection_name="agent_memory",
    persist_directory="./memory_store"
)
  • Store memories as documents with metadata. Include timestamps and type tags.
from langchain.schema import Document
from datetime import datetime

def save_memory(content: str, memory_type: str = "fact", importance: float = 0.5):
    doc = Document(
        page_content=content,
        metadata={
            "type": memory_type,
            "timestamp": datetime.now().isoformat(),
            "importance": importance
        }
    )
    memory_store.add_documents([doc])
  • Retrieve memories relevant to the current input.
def retrieve_memories(query: str, k: int = 5, memory_type: str | None = None) -> list[str]:
    # Restrict the search to one memory type when requested; otherwise search everything.
    metadata_filter = {"type": memory_type} if memory_type else None
    docs = memory_store.similarity_search(query, k=k, filter=metadata_filter)
    return [d.page_content for d in docs]
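The retrieval step above ranks by similarity alone and does not use the importance and timestamp metadata saved earlier. One possible way to use them is to fetch more candidates than needed and rerank, as in the sketch below. The function name `rerank`, the weights, and the 30-day half-life are illustrative assumptions, not part of the guide or of any library. The candidate scores could come from a scored search such as LangChain's `similarity_search_with_relevance_scores`, but score conventions vary between stores and should be checked before reuse.

```python
from datetime import datetime


def rerank(candidates, now=None, half_life_days=30.0, w_sim=0.6, w_imp=0.3, w_rec=0.1):
    """Reorder (content, similarity, metadata) tuples by a blended score.

    Assumes similarity is in [0, 1] with higher meaning closer.
    The weights and half-life are arbitrary illustrative defaults.
    """
    now = now or datetime.now()
    scored = []
    for content, sim, meta in candidates:
        age_days = (now - datetime.fromisoformat(meta["timestamp"])).total_seconds() / 86400
        # Exponential decay: a memory loses half its recency score every half_life_days.
        recency = 0.5 ** (max(age_days, 0.0) / half_life_days)
        score = w_sim * sim + w_imp * meta.get("importance", 0.5) + w_rec * recency
        scored.append((score, content))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [content for _, content in scored]


now = datetime(2025, 1, 31)
candidates = [
    ("old trivia", 0.80, {"timestamp": "2024-01-01T00:00:00", "importance": 0.1}),
    ("key preference", 0.70, {"timestamp": "2025-01-30T00:00:00", "importance": 0.9}),
]
print(rerank(candidates, now=now))
```

Here the slightly less similar but recent and important memory ranks first. Whether that trade-off is desirable depends on the agent, so the weights are worth tuning against real queries.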
  • Create a memory retrieval tool for the agent. Let the agent query its own memory.
from langchain.tools import tool

@tool
def query_memory(query: str) -> str:
    """Search the agent's long-term memory for relevant information."""
    memories = retrieve_memories(query, k=3)
    if not memories:
        return "No relevant memories found."
    return "\n".join(f"- {m}" for m in memories)

@tool
def save_to_memory(fact: str) -> str:
    """Save an important fact to long-term memory."""
    save_memory(fact, memory_type="fact")
    return f"Saved: {fact}"
  • Prune old or low-importance memories. Prevent unbounded growth.
def prune_memories(max_age_days: int = 30, min_importance: float = 0.3):
    # get() returns parallel lists under "ids" and "metadatas".
    all_docs = memory_store.get()
    to_delete = []
    for doc_id, meta in zip(all_docs["ids"], all_docs["metadatas"]):
        age = datetime.now() - datetime.fromisoformat(meta["timestamp"])
        # Delete only memories that are both old and low-importance.
        if age.days > max_age_days and meta["importance"] < min_importance:
            to_delete.append(doc_id)
    if to_delete:
        memory_store.delete(ids=to_delete)
    return len(to_delete)
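Because pruning deletes data, it can help to test the selection rule on plain lists before pointing it at the store. The sketch below isolates that rule; `select_for_pruning` is a hypothetical helper, and skipping records that lack a timestamp is an assumption chosen to err on the side of keeping data. Note that both conditions must hold, so an old but important memory survives.

```python
from datetime import datetime


def select_for_pruning(ids, metadatas, now, max_age_days=30, min_importance=0.3):
    """Return the ids that are both older than max_age_days and below min_importance."""
    to_delete = []
    for doc_id, meta in zip(ids, metadatas):
        timestamp = (meta or {}).get("timestamp")
        if timestamp is None:
            continue  # Assumption: keep records whose age cannot be determined.
        age = now - datetime.fromisoformat(timestamp)
        # Missing importance defaults to 1.0 here, so such records are never pruned.
        if age.days > max_age_days and meta.get("importance", 1.0) < min_importance:
            to_delete.append(doc_id)
    return to_delete


now = datetime(2025, 3, 1)
ids = ["a", "b", "c", "d"]
metadatas = [
    {"timestamp": "2025-01-01T00:00:00", "importance": 0.1},  # old and unimportant
    {"timestamp": "2025-01-01T00:00:00", "importance": 0.9},  # old but important
    {"timestamp": "2025-02-25T00:00:00", "importance": 0.1},  # unimportant but recent
    {"importance": 0.1},                                      # no timestamp
]
print(select_for_pruning(ids, metadatas, now))
```

Passing `now` explicitly keeps the rule deterministic, which makes it straightforward to unit test.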
  • Inject memories into the agent prompt. Add retrieved memories as context.
def build_prompt_with_memory(user_input: str) -> str:
    memories = retrieve_memories(user_input)
    memory_block = "\n".join(f"[Memory] {m}" for m in memories)
    return f"{memory_block}\n\nUser: {user_input}\nAssistant:"
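The prompt builder above emits two leading blank lines when no memories are found and places no limit on how much memory text enters the prompt. A possible variant that handles both cases is sketched below. The names `format_memory_block` and `build_prompt` and the character budget `max_chars` are illustrative assumptions; a token-based budget would be more precise but depends on the model's tokenizer.

```python
def format_memory_block(memories: list[str], max_chars: int = 500) -> str:
    """Join memories into a block, stopping before the character budget is exceeded."""
    lines = []
    used = 0
    for memory in memories:
        line = f"[Memory] {memory}"
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)


def build_prompt(user_input: str, memories: list[str], max_chars: int = 500) -> str:
    """Build the prompt, omitting the memory section entirely when it is empty."""
    block = format_memory_block(memories, max_chars)
    prefix = f"{block}\n\n" if block else ""
    return f"{prefix}User: {user_input}\nAssistant:"


print(build_prompt("What theme do I use?", ["The user prefers dark mode."]))
```

Since retrieval returns the most similar memories first, truncating from the end of the list drops the least relevant entries.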

Verification

python -c "
from langchain.schema import Document
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
emb = OllamaEmbeddings(model='nomic-embed-text')
vs = Chroma(embedding_function=emb, collection_name='test_mem')
vs.add_documents([Document(page_content='test fact')])
r = vs.similarity_search('test', k=1)
print(len(r))
# Expected: 1
"

Common failures

  • Memory persistence lost without persist_directory. ChromaDB in-memory stores vanish when the process restarts. Always set persist_directory.
  • Retrieved memories are irrelevant. The embedding model may not capture the semantic relationship. Try a different embedding model or increase k and rerank.
  • Duplicate memories accumulate. The same fact stored multiple times wastes space. Check for semantic similarity before storing.
  • Version mismatch. The installed package or runtime differs from the one shown in the commands. Check the version first, then rerun the smallest verification command.
  • Local environment drift. Another service, virtual environment, model, or path is being used. Print the active binary path and configuration before changing the guide steps.
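For the duplicate-memory failure above, the check before storing can be sketched as follows. This is a self-contained illustration, not the guide's implementation: `DedupMemory` and `toy_embed` are hypothetical, the bag-of-words vectors stand in for real embeddings, and the 0.9 threshold is an assumption to tune. With a real vector store the same check would use the store's scored search, keeping in mind that some stores return distances (lower is closer) and not similarities.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class DedupMemory:
    """In-memory store that rejects entries too similar to an existing one."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> bool:
        """Store text unless a near-duplicate exists; return True if it was stored."""
        vec = toy_embed(text)
        for _, existing in self.items:
            if cosine(vec, existing) >= self.threshold:
                return False  # Near-duplicate found; skip the write.
        self.items.append((text, vec))
        return True


mem = DedupMemory(threshold=0.9)
print(mem.add("User prefers dark mode"))        # first occurrence
print(mem.add("user prefers dark mode"))        # same fact, different casing
print(mem.add("Deploy script lives in tools"))  # unrelated fact
```

A threshold that is too low will silently drop distinct facts, so it is worth logging rejected writes while tuning.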
