HOW-TO · RAG
How to Use a Vector Store as Agent Memory
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES
Vector store running (ChromaDB, Qdrant), agent framework, Python 3.10+
What this does
A vector store acts as the agent's external memory by storing past interactions, facts, and documents as embeddings. The agent retrieves semantically relevant memories when processing new inputs.
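To make "semantically relevant" concrete, the following is a minimal sketch of what a similarity search does internally: it ranks stored vectors by cosine similarity to the query vector. The three-dimensional vectors, the `toy_memories` dictionary, and the `top_k` helper are invented for illustration. A real store such as ChromaDB uses model-generated embeddings and an approximate index rather than a full sort.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings". A real embedding model
# produces vectors with hundreds of dimensions.
toy_memories = {
    "User prefers dark mode": [0.9, 0.1, 0.0],
    "Deploy target is Ubuntu": [0.0, 0.2, 0.9],
    "User dislikes bright themes": [0.8, 0.3, 0.1],
}

def top_k(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k stored texts whose vectors are closest to the query."""
    ranked = sorted(
        toy_memories,
        key=lambda text: cosine_similarity(query_vec, toy_memories[text]),
        reverse=True,
    )
    return ranked[:k]

# A query vector pointing roughly in the "theme preference" direction.
print(top_k([1.0, 0.2, 0.0]))
# ['User prefers dark mode', 'User dislikes bright themes']
```

The two theme-related memories rank first even though they share few words, which is the property the agent relies on when it recalls memories by meaning rather than by keyword.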
Steps
- Initialize the vector store with an embedding function.
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
embeddings = OllamaEmbeddings(model="nomic-embed-text")
memory_store = Chroma(
    embedding_function=embeddings,
    collection_name="agent_memory",
    persist_directory="./memory_store"
)
- Store memories as documents with metadata. Include timestamps and type tags.
from datetime import datetime
from langchain_core.documents import Document  # older releases exposed this as langchain.schema
def save_memory(content: str, memory_type: str = "fact", importance: float = 0.5):
    # Keep metadata values scalar (str, float) so the store can filter on them later.
    doc = Document(
        page_content=content,
        metadata={
            "type": memory_type,
            "timestamp": datetime.now().isoformat(),
            "importance": importance,
        },
    )
    memory_store.add_documents([doc])
- Retrieve memories relevant to the current input.
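def retrieve_memories(query: str, k: int = 5, memory_type: str | None = None) -> list[str]:
    # Restrict the search to one memory type when requested; None searches everything.
    # Named metadata_filter to avoid shadowing the built-in filter().
    metadata_filter = {"type": memory_type} if memory_type else None
    docs = memory_store.similarity_search(query, k=k, filter=metadata_filter)
    return [d.page_content for d in docs]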
- Create a memory retrieval tool for the agent. Let the agent query its own memory.
from langchain.tools import tool
@tool
def query_memory(query: str) -> str:
    """Search the agent's long-term memory for relevant information."""
    memories = retrieve_memories(query, k=3)
    if not memories:
        return "No relevant memories found."
    return "\n".join(f"- {m}" for m in memories)
@tool
def save_to_memory(fact: str) -> str:
    """Save an important fact to long-term memory."""
    save_memory(fact, memory_type="fact")
    return f"Saved: {fact}"
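- Prune memories that are both old and low-importance. This prevents unbounded growth while keeping old but important facts.
def prune_memories(max_age_days: int = 30, min_importance: float = 0.3) -> int:
    all_docs = memory_store.get()  # dict with parallel "ids" and "metadatas" lists
    now = datetime.now()
    to_delete = []
    for doc_id, meta in zip(all_docs["ids"], all_docs["metadatas"]):
        if not meta or "timestamp" not in meta:
            continue  # skip entries that were not written by save_memory
        age = now - datetime.fromisoformat(meta["timestamp"])
        # Delete only when BOTH conditions hold: too old and below the importance floor.
        # A missing importance value is treated as 0.0, i.e. low.
        if age.days > max_age_days and meta.get("importance", 0.0) < min_importance:
            to_delete.append(doc_id)
    if to_delete:
        memory_store.delete(ids=to_delete)
    return len(to_delete)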
- Inject memories into the agent prompt. Add retrieved memories as context.
def build_prompt_with_memory(user_input: str) -> str:
    memories = retrieve_memories(user_input)
    memory_block = "\n".join(f"[Memory] {m}" for m in memories)
    # Omit the memory block entirely when nothing was retrieved.
    prefix = f"{memory_block}\n\n" if memory_block else ""
    return f"{prefix}User: {user_input}\nAssistant:"
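Injected memories compete with the user input for context window space, so it can help to cap how much memory text reaches the prompt. The sketch below is one possible approach, not part of the steps above: `format_memory_block` and its `max_chars` parameter are hypothetical names, and a character budget is only a rough stand-in for a real token count. It assumes the input list is ordered most relevant first, as `similarity_search` returns it.

```python
def format_memory_block(memories: list[str], max_chars: int = 500) -> str:
    """Join memories into a prompt block, skipping repeats and stopping at max_chars."""
    seen: set[str] = set()
    lines: list[str] = []
    used = 0
    for memory in memories:  # assumed to be ordered most relevant first
        key = " ".join(memory.lower().split())  # normalise case and whitespace
        if key in seen:
            continue  # drop exact repeats (after normalisation)
        line = f"[Memory] {memory}"
        cost = len(line) + (1 if lines else 0)  # +1 for the joining newline
        if used + cost > max_chars:
            break  # budget exhausted; less relevant memories are dropped
        seen.add(key)
        lines.append(line)
        used += cost
    return "\n".join(lines)

print(format_memory_block(["A fact", "a  FACT", "Another"], max_chars=100))
```

Because the loop stops at the first memory that does not fit, the most relevant entries are always the ones kept.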
Verification
python -c "
from langchain_core.documents import Document
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
emb = OllamaEmbeddings(model='nomic-embed-text')
vs = Chroma(embedding_function=emb, collection_name='test_mem')
vs.add_documents([Document(page_content='test fact')])
r = vs.similarity_search('test', k=1)
print(len(r))
# Expected: 1
"
Common failures
- Memory persistence lost without persist_directory. ChromaDB in-memory stores vanish when the process restarts. Always set persist_directory.
- Retrieved memories are irrelevant. The embedding model may not capture the semantic relationship. Try a different embedding model, or increase k and rerank.
- Duplicate memories accumulate. The same fact stored multiple times wastes space. Check for semantic similarity before storing.
- Version mismatch - The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
- Local environment drift - Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
Related guides
- How to Implement Agent Memory (Short and Long Term)
- How to Apply Metadata Filters to Reduce Search Space