Agent Memory — Introduction to AI Agents (Chapter 10)

Agents need memory to retain information across multiple turns. Without memory, each conversation turn starts from scratch and the agent cannot build on prior actions.

Message history as memory

The simplest form of memory is a list of messages. Each new message is appended to the list, and the oldest non-system messages are dropped once the list exceeds a fixed size:

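from typing import Optional


class SimpleAgentMemory:
    def __init__(self, max_messages: int = 40):
        self.max_messages = max_messages
        self.messages = []

    def add_turn(self, role: str, content: str, tool_calls: Optional[list] = None):
        """Append one message (optionally with tool calls), then enforce the size limit."""
        entry = {"role": role, "content": content}
        if tool_calls:
            entry["tool_calls"] = tool_calls
        self.messages.append(entry)
        self._trim()

    def _trim(self):
        """Remove the oldest non-system messages if the limit is exceeded."""
        if len(self.messages) <= self.max_messages:
            return
        system = [m for m in self.messages if m["role"] == "system"]
        others = [m for m in self.messages if m["role"] != "system"]
        keep = max(self.max_messages - len(system), 0)
        # others[-0:] would return the whole list, so handle keep == 0 explicitly.
        self.messages = system + (others[-keep:] if keep else [])

    def get_context(self, limit: Optional[int] = None) -> list:
        """Return a copy of the history, or only the last `limit` messages."""
        if limit is not None:
            # Note: a small limit can cut off system messages at the start of the list.
            return self.messages[-limit:] if limit > 0 else []
        return list(self.messages)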
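
To see the trimming rule in isolation, the sketch below restates it as a standalone function and runs it on a short history. The name trim_messages is a hypothetical helper introduced for illustration, and the message shape (dicts with "role" and "content" keys) is assumed to match the class above.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_messages(messages: List[Message], max_messages: int) -> List[Message]:
    """Keep all system messages plus the newest non-system messages that fit."""
    if len(messages) <= max_messages:
        return list(messages)
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    keep = max(max_messages - len(system), 0)
    # others[-0:] would return the whole list, so handle keep == 0 explicitly.
    return system + (others[-keep:] if keep else [])


# Build a history of one system message followed by four question/answer pairs.
history = [{"role": "system", "content": "You are a helpful agent."}]
for i in range(4):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_messages(history, max_messages=5)
print([m["content"] for m in trimmed])
```

With a limit of 5, the system message survives and only the two most recent question/answer pairs remain. A count-based limit is easy to reason about but ignores message length, so a few very long messages can still overflow a model's context window.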

Summarization-based compression

In a long conversation the full history is resent on every turn, so the prompt size grows linearly with the number of turns. To bound it, replace turns that are no longer immediately relevant with a short summary:

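def summarize_old_history(memory: SimpleAgentMemory, model, summarize_before: int = 20):
    """Replace older messages with a model-written summary once the history is long enough."""
    if len(memory.messages) < summarize_before:
        return

    # Keep system messages (including any earlier summary) out of the summarized span
    # so the original instructions are not lost.
    system = [m for m in memory.messages if m["role"] == "system"]
    others = [m for m in memory.messages if m["role"] != "system"]
    old_messages = others[:-5]
    last_messages = others[-5:]  # Keep the last 5 messages verbatim
    if not old_messages:
        return

    summary_prompt = (
        "Summarize the following conversation in 2-3 sentences. "
        "Focus on key decisions, facts learned, and tasks completed.\n\n" +
        # content may be None for tool-call messages, so fall back to an empty string
        "\n".join(f"{m['role']}: {(m['content'] or '')[:200]}" for m in old_messages)
    )

    # model.chat is expected to return an object with a .content attribute
    summary_response = model.chat([
        {"role": "user", "content": summary_prompt}
    ])

    summary = {"role": "system", "content": f"Previous context: {summary_response.content}"}
    memory.messages = system + [summary] + last_messages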
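
Because summarization depends on a model call, it can help to exercise the logic offline first. The sketch below uses a hypothetical StubModel whose chat method returns an object with a .content attribute, which is the interface the function above assumes; a real client will likely differ. The compress function is an illustrative standalone variant that works on a plain list, and its keep_recent parameter is an assumption, not part of the original function.

```python
from dataclasses import dataclass
from typing import Dict, List

Message = Dict[str, str]


@dataclass
class StubResponse:
    content: str


class StubModel:
    """Hypothetical stand-in for a chat model; returns a canned summary."""

    def __init__(self) -> None:
        self.prompts: List[str] = []

    def chat(self, messages: List[Message]) -> StubResponse:
        # Record the prompt so a test can inspect what would have been sent.
        self.prompts.append(messages[0]["content"])
        return StubResponse(content="Earlier questions were answered.")


def compress(messages: List[Message], model: StubModel, keep_recent: int = 5) -> List[Message]:
    """Summarize everything except the newest keep_recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    if keep_recent <= 0 or len(others) <= keep_recent:
        return list(messages)  # nothing to summarize, or keep_recent is not positive
    old, recent = others[:-keep_recent], others[-keep_recent:]
    transcript = "\n".join(f"{m['role']}: {m['content'][:200]}" for m in old)
    reply = model.chat([{"role": "user", "content": "Summarize:\n" + transcript}])
    summary = {"role": "system", "content": f"Previous context: {reply.content}"}
    return system + [summary] + recent


# One system message followed by ten question/answer pairs (21 messages).
history = [{"role": "system", "content": "You are a helpful agent."}]
for i in range(10):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

model = StubModel()
compressed = compress(history, model)
print(len(history), "->", len(compressed))
```

The 21-message history shrinks to 7: the original system message, one summary, and the five most recent messages. Note that a fixed cut point can separate a tool call from its result, so a production version would probably need to cut on turn boundaries instead.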

Structured memory

Store specific facts in structured slots to avoid re-discovering known information:

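from typing import Any, Optional


class StructuredMemory:
    def __init__(self):
        self.facts = {}  # key -> value pairs
        self.tasks_completed = []
        self.pending_tasks = []

    def store_fact(self, key: str, value: Any):
        self.facts[key] = value

    def get_fact(self, key: str) -> Optional[Any]:
        """Return the stored value, or None if the key is unknown."""
        return self.facts.get(key)

    def mark_complete(self, task: str):
        """Move a task from pending to completed, recording it only once."""
        if task not in self.tasks_completed:
            self.tasks_completed.append(task)
        if task in self.pending_tasks:
            self.pending_tasks.remove(task)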
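
Structured slots only help if their contents reach the model. One possible approach, sketched below, is to render them into a short text block that is placed in the prompt. The render_memory helper and its output layout are assumptions made for illustration, and the function takes plain containers so that the snippet runs on its own.

```python
from typing import Any, Dict, List


def render_memory(facts: Dict[str, Any], completed: List[str], pending: List[str]) -> str:
    """Format structured memory as plain text suitable for a system message."""
    lines = ["Known facts:"]
    # Sort keys so the rendered text is stable from turn to turn.
    lines += [f"- {key}: {facts[key]}" for key in sorted(facts)] or ["- (none)"]
    lines.append("Completed tasks: " + (", ".join(completed) or "(none)"))
    lines.append("Pending tasks: " + (", ".join(pending) or "(none)"))
    return "\n".join(lines)


# Example values are made up for the demonstration.
text = render_memory(
    facts={"user_name": "Ada", "repo": "agent-demo"},
    completed=["clone repo"],
    pending=["run tests"],
)
print(text)
```

Rendering the slots as text keeps the stored data independent of any particular prompt format, so the layout can change without touching how facts are stored.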