HOW-TO · RAG
How to Implement Memory in LangGraph State
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES
LangGraph agent set up, Python 3.10+
What this does
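Memory in LangGraph keeps conversation history across turns by combining a reducer on the state schema with a checkpointer. The reducer appends each turn's messages to the state instead of replacing them, and the checkpointer saves that state between invocations, which gives the model multi-turn context.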
Steps
- Define state with message accumulation. Use the add_messages reducer to append.
from typing import TypedDict, Annotated, Sequence
from langgraph.graph.message import add_messages
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage

class StateWithMemory(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    user_name: str
- Create a node that accesses prior messages. The agent sees the full history.
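from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.2", temperature=0)

def chat_node(state: StateWithMemory) -> dict:
    # Pass the full accumulated history to the model.
    response = llm.invoke(state["messages"])
    # Return only the new message; add_messages appends it to the state.
    return {"messages": [response]}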
- Build and run the graph across multiple turns. Use a checkpointer to persist state between invocations.
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver
workflow = StateGraph(StateWithMemory)
workflow.add_node("chat", chat_node)
workflow.set_entry_point("chat")
workflow.add_edge("chat", END)
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
config = {"configurable": {"thread_id": "user-1"}}
# Turn 1
result = app.invoke({"messages": [HumanMessage(content="Hi, my name is Alice.")], "user_name": ""}, config)
print(result["messages"][-1].content)
# Turn 2 — the agent remembers "Alice"
result = app.invoke({"messages": [HumanMessage(content="What's my name?")]}, config)
print(result["messages"][-1].content)
# Expected: Your name is Alice.
- Implement summary memory for long sessions. When message history grows too large, summarize old messages.
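from langchain_core.messages import RemoveMessage

def summarize_memory(state: StateWithMemory) -> dict:
    messages = state["messages"]
    if len(messages) > 10:
        old = messages[:-5]
        summary_prompt = f"Summarize this conversation:\n\n{old}"
        summary = llm.invoke([HumanMessage(content=summary_prompt)])
        # add_messages merges by ID, so old messages must be removed explicitly;
        # the summary is appended after the five most recent messages.
        removals = [RemoveMessage(id=m.id) for m in old]
        return {"messages": removals + [AIMessage(content=f"[Summary] {summary.content}")]}
    return {}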
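- Add the summarization node to the graph. Wire it in place of the direct edge from "chat" to END, and do this before calling compile(). The node then runs after every turn but only summarizes once the history exceeds 10 messages.
workflow.add_node("summarize", summarize_memory)
workflow.add_edge("chat", "summarize")  # replaces workflow.add_edge("chat", END)
workflow.add_edge("summarize", END)
app = workflow.compile(checkpointer=memory)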
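A conditional edge is one alternative to routing every turn through the summarize node. The sketch below shows only the routing decision, using plain dicts so it runs without LangGraph. should_summarize and the "end" label are hypothetical names, and the threshold of 10 mirrors the check inside summarize_memory.

```python
from typing import Any, Mapping


def should_summarize(state: Mapping[str, Any], threshold: int = 10) -> str:
    """Return the label of the next step: "summarize" or "end"."""
    # Only route to the summarizer once the history is longer than the threshold.
    return "summarize" if len(state["messages"]) > threshold else "end"


# Placeholder strings stand in for message objects; only the count matters here.
short_history = {"messages": ["m"] * 4}
long_history = {"messages": ["m"] * 12}

print(should_summarize(short_history))
print(should_summarize(long_history))
```

Assuming the graph from the steps above, the function could be attached with workflow.add_conditional_edges("chat", should_summarize, {"summarize": "summarize", "end": END}) instead of the fixed edge from "chat" to "summarize".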
Verification
python -c "
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage
msgs = add_messages([], [HumanMessage(content='Hi')])
msgs = add_messages(msgs, [AIMessage(content='Hello')])
print(len(msgs))
# Expected: 2
"
Common failures
- State not persisted between turns. Without a checkpointer (e.g., MemorySaver), state is ephemeral and each invocation starts fresh.
- Messages array grows unbounded. Without summarization, the history eventually overflows the context window. Add a summarization node that triggers once the history passes a threshold.
- Duplicate messages from re-invocation. The add_messages reducer deduplicates by message ID: a message whose ID matches an existing one replaces it, while a message without an ID is assigned a new one and appended. Resending earlier messages without their IDs therefore creates duplicates. Pass only the new message on each turn and let LangChain generate the IDs.
- Version mismatch - The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
- Local environment drift - Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
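For the last two items, a short report of the active interpreter and installed package versions is often enough to spot the mismatch. The sketch below uses only the standard library. environment_report is a hypothetical helper, and the package list is an assumption based on the imports used in this guide.

```python
import sys
from importlib import metadata

# Distribution names assumed from the imports used in this guide.
PACKAGES = ("langgraph", "langchain-core", "langchain-ollama")


def environment_report(packages=PACKAGES) -> dict:
    """Collect the interpreter path, Python version, and package versions."""
    report = {"python": sys.version.split()[0], "executable": sys.executable}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If the executable path points outside the expected virtual environment, fix that before changing any of the guide steps.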
Related guides
- How to Define State Schema in LangGraph
- How to Create ConversationalRetrievalChain with Memory