RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Introduction to AI Agents

10. Agent Memory

Chapter 10 of 16 · 20 min
KEY INSIGHT

Memory is not just the conversation transcript. It includes compressed summaries, structured facts, and session state. Without active memory management, context windows fill up and older relevant facts become inaccessible.

Agents need memory to retain information across multiple turns. Without memory, each conversation turn starts from scratch and the agent cannot build on prior actions.

Message history as memory

The simplest memory is a message list. Every turn appends the new exchange to the list:

from typing import Optional


class SimpleAgentMemory:
    def __init__(self, max_messages: int = 40):
        self.max_messages = max_messages
        self.messages = []

    def add_turn(self, role: str, content: str, tool_calls: Optional[list] = None):
        """Append one message (optionally with tool calls), then enforce the limit."""
        entry = {"role": role, "content": content}
        if tool_calls:
            entry["tool_calls"] = tool_calls
        self.messages.append(entry)
        self._trim()

    def _trim(self):
        """Drop the oldest non-system messages once the limit is exceeded."""
        if len(self.messages) <= self.max_messages:
            return
        system = [m for m in self.messages if m["role"] == "system"]
        others = [m for m in self.messages if m["role"] != "system"]
        # Guard against keep == 0: others[-0:] would return the whole list.
        keep = max(self.max_messages - len(system), 0)
        self.messages = system + (others[-keep:] if keep else [])

    def get_context(self, limit: Optional[int] = None) -> list:
        """Return a copy of the history, or only the last `limit` messages."""
        if limit is None:
            return list(self.messages)
        return self.messages[-limit:] if limit > 0 else []
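
Counting messages is only a proxy for context size, since one long tool result can outweigh dozens of short turns. The following is a minimal sketch of trimming against a token budget instead. It assumes a rough heuristic of about four characters per token, which is an approximation for English text and not a real tokenizer. `estimate_tokens` and `trim_to_budget` are hypothetical helper names, not part of the class above.

```python
def estimate_tokens(text: str) -> int:
    """Rough size estimate.

    ASSUMPTION: about 4 characters per token. Swap in a real tokenizer for accuracy.
    """
    return max(1, len(text) // 4)


def trim_to_budget(messages: list, budget: int) -> list:
    """Keep all system messages, then the newest other messages that fit in `budget`."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]

    used = sum(estimate_tokens(m.get("content") or "") for m in system)
    kept = []
    # Walk backwards from the newest message and stop at the first one that does not fit.
    for m in reversed(others):
        cost = estimate_tokens(m.get("content") or "")
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + kept[::-1]  # restore chronological order


history = [
    {"role": "system", "content": "You are a helpful agent."},  # 24 chars, about 6 tokens
    {"role": "user", "content": "a" * 400},                     # about 100 tokens
    {"role": "assistant", "content": "b" * 200},                # about 50 tokens
    {"role": "user", "content": "c" * 80},                      # about 20 tokens
]
trimmed = trim_to_budget(history, budget=80)
print([m["role"] for m in trimmed])  # ['system', 'assistant', 'user']
```

The oldest user message is dropped because adding its estimated 100 tokens would exceed the budget. Stopping at the first message that does not fit keeps the retained history contiguous, at the cost of sometimes leaving part of the budget unused.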

Summarization-based compression

For long conversations, the prompt sent with each request grows linearly with the number of turns, because the full history is resent every time. Compress the history by summarizing turns that are no longer immediately relevant:

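SUMMARY_PREFIX = "Previous context: "


def summarize_old_history(memory: SimpleAgentMemory, model, summarize_before: int = 20):
    """Replace all but the last 5 messages with a model-written summary.

    `model` is assumed to expose chat(messages) returning an object with a .content string.
    """
    if len(memory.messages) < summarize_before:
        return

    def is_system_prompt(m):
        # Earlier summaries are system messages too, but they should be re-summarized.
        return m["role"] == "system" and not (m.get("content") or "").startswith(SUMMARY_PREFIX)

    # Preserve the original system prompt instead of folding it into the summary.
    system = [m for m in memory.messages if is_system_prompt(m)]
    others = [m for m in memory.messages if not is_system_prompt(m)]
    old_messages = others[:-5]  # Everything to be compressed
    last_turns = others[-5:]    # Keep the last 5 messages verbatim
    if not old_messages:
        return

    summary_prompt = (
        "Summarize the following conversation in 2-3 sentences. "
        "Focus on key decisions, facts learned, and tasks completed.\n\n" +
        "\n".join(f"{m['role']}: {(m.get('content') or '')[:200]}" for m in old_messages)
    )

    summary_response = model.chat([
        {"role": "user", "content": summary_prompt}
    ])

    summary = {"role": "system", "content": f"{SUMMARY_PREFIX}{summary_response.content}"}
    memory.messages = system + [summary] + last_turns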
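
To see what the compression buys, the following sketch simulates context size message by message, with and without periodic summarization. It calls no model and only counts tokens. The numbers are made up for illustration (100 tokens per message, a 60-token summary, 5 recent messages kept), and `context_sizes` is a hypothetical helper.

```python
from typing import Optional


def context_sizes(n_messages: int, tokens_per_message: int = 100,
                  summarize_before: Optional[int] = None,
                  keep_recent: int = 5, summary_tokens: int = 60) -> list:
    """Return the context size in tokens after each appended message.

    ASSUMPTION: every message costs the same and every summary has the same length.
    """
    sizes = []
    count = 0            # verbatim messages currently in context
    has_summary = False  # whether a summary message is in context
    for _ in range(n_messages):
        count += 1
        total = count + (1 if has_summary else 0)
        # Compress once the history (summary included) reaches the threshold.
        if summarize_before is not None and total >= summarize_before:
            count = keep_recent
            has_summary = True
        summary_cost = summary_tokens if has_summary else 0
        sizes.append(summary_cost + count * tokens_per_message)
    return sizes


plain = context_sizes(60)
compressed = context_sizes(60, summarize_before=20)
print(max(plain), max(compressed))  # 6000 1900
```

Under these assumptions the uncompressed context reaches 6,000 tokens after 60 messages. The compressed one peaks at 1,900 tokens just before the first summary, then cycles between 560 and 1,860.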

Structured memory

Store specific facts in structured slots to avoid re-discovering known information:

from typing import Any, Optional


class StructuredMemory:
    def __init__(self):
        self.facts = {}  # key: value pairs
        self.tasks_completed = []
        self.pending_tasks = []

    def store_fact(self, key: str, value: Any):
        """Record or overwrite a fact under `key`."""
        self.facts[key] = value

    def get_fact(self, key: str) -> Optional[Any]:
        """Return the stored value, or None if the fact is unknown."""
        return self.facts.get(key)

    def mark_complete(self, task: str):
        """Move a task from pending to completed."""
        self.tasks_completed.append(task)
        if task in self.pending_tasks:
            self.pending_tasks.remove(task)
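
Stored facts only help if they reach the model. One way to do that is to render the slots into a compact system message on every turn. The sketch below assumes facts are simple key-value pairs and tasks are short strings. `render_memory` is a hypothetical helper that takes plain containers shaped like the attributes of `StructuredMemory`, and the example values are illustrative only.

```python
from typing import Any


def render_memory(facts: dict, pending: list, completed: list) -> dict:
    """Build a system message that restates known facts and task status."""
    lines = ["Known facts:"]
    # Sort keys so the rendered prompt is stable from turn to turn.
    fact_lines = [f"- {key}: {facts[key]}" for key in sorted(facts)]
    lines += fact_lines or ["- (none)"]
    lines.append("Completed tasks: " + (", ".join(completed) or "none"))
    lines.append("Pending tasks: " + (", ".join(pending) or "none"))
    return {"role": "system", "content": "\n".join(lines)}


message = render_memory(
    facts={"user_name": "Ada", "gpu": "RTX 4090"},
    pending=["benchmark model"],
    completed=["install drivers"],
)
print(message["content"])
```

Keeping this message short matters: it is sent with every request, so it competes with the conversation itself for context space.
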
EXERCISE

Implement a memory system that automatically summarizes every 10 turns. Run a 20-turn conversation and verify that after summarization, the agent still remembers key facts from the early turns.
