05. Prompt Chaining Patterns
Simple prompting treats each request as independent. Prompt chaining sequences prompts so that each step's output becomes the next step's input. This enables workflows that no single prompt could perform reliably.
A canonical chain pattern: generation → evaluation → revision. The first prompt generates content, the second evaluates it against criteria, the third revises based on evaluation feedback. This decomposes the "generate good content" task into specialized subtasks, each with clearer success criteria than the undifferentiated original.
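The generation → evaluation → revision loop can be sketched without committing to a model backend. In the minimal sketch below, `llm` is any callable that maps a prompt string to a response string (an assumption for illustration; in practice it might wrap `ollama.generate`). The `PASS` verdict convention, the function name, and the scripted `fake_llm` are likewise hypothetical.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def generate_evaluate_revise(task: str, criteria: str, llm: LLM, max_rounds: int = 2) -> str:
    """Draft, critique against criteria, and revise until the critique passes."""
    draft = llm(f"Write a response to this task.\nTask: {task}")
    for _ in range(max_rounds):
        critique = llm(
            "Evaluate the draft against the criteria. "
            "Reply with exactly PASS if it meets all of them, "
            "otherwise list the problems.\n"
            f"Criteria: {criteria}\nDraft: {draft}"
        )
        if critique.strip().upper() == "PASS":
            break  # evaluator is satisfied, stop revising
        draft = llm(
            "Revise the draft to fix the listed problems.\n"
            f"Problems: {critique}\nDraft: {draft}"
        )
    return draft


# Scripted stand-in for a model so the control flow can be run offline.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Write"):
        return "draft v1"
    if prompt.startswith("Evaluate"):
        return "PASS" if "draft v2" in prompt else "Too short."
    return "draft v2"  # any other prompt is the revision request


final = generate_evaluate_revise("Explain chaining", "At least two sentences", fake_llm)
print(final)
```

Capping the loop with `max_rounds` keeps a never-satisfied evaluator from revising indefinitely; the last draft is returned either way.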
Another pattern: decomposition → parallel solving → aggregation. The first prompt identifies subtasks, subsequent prompts solve them, and a final prompt aggregates the results. Because the subtasks do not depend on one another, their prompts can run concurrently.
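A minimal sketch of decomposition → parallel solving → aggregation, assuming the same hypothetical `llm` callable (prompt string in, response string out) and assuming the decomposition prompt returns one subtask per line. A thread pool is used here because model calls are typically I/O-bound; that choice is an assumption, not something the pattern requires.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def decompose_solve_aggregate(question: str, llm: LLM, max_workers: int = 4) -> str:
    # Step 1: ask for independent subtasks, one per line.
    plan = llm(f"List the independent subtasks needed to answer, one per line.\nQuestion: {question}")
    subtasks = [line.strip("-* \t") for line in plan.splitlines() if line.strip()]

    # Step 2: solve subtasks concurrently; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        answers = list(pool.map(lambda s: llm(f"Solve this subtask.\nSubtask: {s}"), subtasks))

    # Step 3: hand all partial results to one aggregation prompt.
    findings = "\n".join(f"- {s}: {a}" for s, a in zip(subtasks, answers))
    return llm(f"Combine these findings into one answer.\nQuestion: {question}\nFindings:\n{findings}")


# Scripted stand-in for a model so the data flow can be checked offline.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("List"):
        return "- pricing\n- latency"
    if prompt.startswith("Solve"):
        return "ok:" + prompt.split("Subtask: ")[1]
    return prompt.split("Findings:\n")[1]  # echo what the aggregator received


result = decompose_solve_aggregate("Which provider should we use?", fake_llm)
print(result)
```

The aggregation step only works if the subtasks really are independent; if one subtask needs another's answer, that pair belongs in a sequential chain instead.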
The critical design decision: where to split the chain? Splitting at the wrong boundary creates brittle dependencies. If step 2 assumes a specific output format from step 1, any variation in step 1's output breaks step 2. Reliable chains build format expectations into each step or include format normalization between steps.
from dataclasses import dataclass
from typing import Optional
import json
import ollama


@dataclass
class ChainStep:
    step_id: str
    prompt_template: str
    input_keys: list[str]
    output_key: str


class PromptChain:
    def __init__(self):
        self.steps: list[ChainStep] = []
        self.state: dict = {}

    def add_step(self, step: ChainStep):
        self.steps.append(step)

    def execute(self, initial_inputs: dict) -> dict:
        """Execute the full chain from initial inputs."""
        self.state = initial_inputs.copy()
        for step in self.steps:
            # Render prompt with current state
            rendered = step.prompt_template
            for key in step.input_keys:
                value = self.state.get(key, "")
                rendered = rendered.replace(f"{{{key}}}", str(value))

            # Execute with retry logic
            response = None
            for attempt in range(3):
                result = ollama.generate(
                    model='llama3.2',
                    prompt=rendered,
                    options={'temperature': 0.3}
                )
                # Basic success check (customize per use case)
                if len(result['response']) > 10:
                    response = result['response']
                    break

            if response is None:
                raise RuntimeError(f"Step {step.step_id} failed after retries")
            self.state[step.output_key] = response
        return self.state
# Example: Article improvement chain
article_analysis = ChainStep(
    step_id="analyze",
    prompt_template="""Analyze this article for these specific issues:
1. Logical inconsistencies
2. Unsupported claims
3. Weak transitions between paragraphs
Article: {article_text}
Return JSON with keys: 'issues' (list of issue descriptions), 'overall_quality' (score 1-10).""",
    input_keys=["article_text"],
    output_key="analysis"
)

# This step only plans the revision. Its input_keys name keys that exist in
# chain state; 'issues' lives inside the analysis JSON, not as its own state
# entry, so the template reads it through {analysis}.
revision_guidance = ChainStep(
    step_id="revision_plan",
    prompt_template="""Based on the analysis, provide specific revision instructions.
Analysis: {analysis}
Original article: {article_text}
Do not rewrite the article yet. For each issue in the analysis, state the change that would address it.
Return JSON with 'revision_steps' (ordered list of changes to make).""",
    input_keys=["analysis", "article_text"],
    output_key="revision_plan"
)

revision_execution = ChainStep(
    step_id="revise",
    prompt_template="""Revise this article following these specific steps.
Original: {article_text}
Revision plan: {revision_plan}
Return the complete revised article.""",
    input_keys=["article_text", "revision_plan"],
    output_key="revised_article"
)

# The chain self-documents its own structure
chain = PromptChain()
chain.add_step(article_analysis)
chain.add_step(revision_guidance)
chain.add_step(revision_execution)

# Execute
with open("draft_article.txt", encoding="utf-8") as f:
    article_text = f.read()

result = chain.execute({"article_text": article_text})
print(result['revised_article'])
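This example shows key patterns: each step is explicit about its inputs, intermediate state is stored for debugging, and format expectations are embedded in the prompt structure. The JSON output requirement creates a contract between steps that is easier to validate than unstructured text. Note that the prompts only request JSON: execute checks response length, not structure, so the contract is not enforced until a step actually parses the output.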
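One way to enforce the JSON contract between steps is a small normalization and validation helper. The sketch below is an assumption about how that could look, not part of the chain above: `parse_step_json` is a hypothetical name, and it assumes the model emits a single JSON object, possibly surrounded by extra prose.

```python
import json
import re


def parse_step_json(raw: str, required_keys: set[str]) -> dict:
    """Extract a JSON object from model output and check that required keys exist."""
    # Models often wrap JSON in prose; take everything from the first '{' to the last '}'.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in step output")
    data = json.loads(match.group(0))  # raises json.JSONDecodeError (a ValueError) if malformed
    missing = required_keys - data.keys()
    if missing:
        raise ValueError(f"step output missing keys: {sorted(missing)}")
    return data


# Example with a hand-written response standing in for model output.
raw = (
    'Sure, here is the analysis: '
    '{"issues": ["claim in paragraph 2 lacks a source"], "overall_quality": 6} '
    'Let me know if you need more.'
)
analysis = parse_step_json(raw, {"issues", "overall_quality"})
print(analysis["issues"])
```

Because every failure surfaces as a `ValueError`, a chain runner could treat a parse failure the same way it treats a too-short response and retry the step.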
Common failures in chaining:
Context overflow: Each step that carries earlier outputs forward adds to context length. After 4-5 steps in a long workflow, the model's context window can fill with accumulated prompts and responses. Chains must include truncation or summarization steps for long workflows.
Error propagation: A weak output from step N degrades step N+1. Chaining without intermediate validation means errors compound. High-stakes chains should include verification steps that can trigger retry or escalation.
Oversplitting: Splitting into too many tiny steps creates excessive API calls and latency, and makes debugging harder. The art is finding steps large enough to be meaningful but small enough to have clear success criteria.
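Two of these failure modes, context overflow and error propagation, can be guarded against with small helpers. The sketch below is illustrative only: `truncate_middle` and `run_with_verification` are hypothetical names, the character budget is a rough stand-in for a token budget, and the `verify` and `on_escalate` callables are assumptions about how a chain might plug in its own checks.

```python
from typing import Callable, Optional


def truncate_middle(text: str, max_chars: int, marker: str = "\n[...truncated...]\n") -> str:
    """Keep the head and tail of a carried-forward value so it fits a character budget.

    Assumes max_chars is larger than the marker itself.
    """
    if len(text) <= max_chars:
        return text
    keep = max(max_chars - len(marker), 0)
    head = keep // 2 + keep % 2
    tail = keep // 2
    return text[:head] + marker + (text[-tail:] if tail else "")


def run_with_verification(
    step: Callable[[], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
    on_escalate: Optional[Callable[[list[str]], str]] = None,
) -> str:
    """Retry a step until verify accepts its output; escalate if it never does."""
    rejected: list[str] = []
    for _ in range(max_attempts):
        output = step()
        if verify(output):
            return output
        rejected.append(output)
    if on_escalate is not None:
        return on_escalate(rejected)  # e.g. hand off to a human or a stronger model
    raise RuntimeError(f"output rejected {max_attempts} times")


# Offline demonstration with scripted outputs instead of model calls.
attempts = iter(["", "too short", "A complete, well-formed answer."])
accepted = run_with_verification(lambda: next(attempts), lambda out: len(out) > 10)
shortened = truncate_middle("a" * 50 + "b" * 50, max_chars=40)
print(accepted)
print(len(shortened))
```

Truncating from the middle is one choice among several; for inputs where the middle matters, a summarization step would preserve more meaning at the cost of an extra model call.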
Identify a multi-step task you currently perform with a single prompt. Decompose it into 2-3 steps with intermediate artifacts. Implement the chain and compare output quality and consistency to the single-prompt approach. The comparison should include both qualitative review and quantitative success metrics.
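For the quantitative side of the comparison, a small harness along these lines may help. It is a sketch under assumptions: `Pipeline` and `Check` are hypothetical type aliases, the pass/fail check is something you define for your own task, and the stand-in pipelines below are deterministic placeholders, not real measurements of either approach.

```python
from typing import Callable

Pipeline = Callable[[str], str]     # input text -> final output
Check = Callable[[str, str], bool]  # (input, output) -> did this run succeed?


def success_rate(pipeline: Pipeline, inputs: list[str], check: Check, runs: int = 3) -> float:
    """Fraction of runs whose output passes the check, across all inputs."""
    results = [check(text, pipeline(text)) for text in inputs for _ in range(runs)]
    if not results:
        raise ValueError("need at least one input and one run")
    return sum(results) / len(results)


def compare(single: Pipeline, chained: Pipeline, inputs: list[str], check: Check) -> dict[str, float]:
    """Score both approaches on the same inputs with the same check."""
    return {
        "single": success_rate(single, inputs, check),
        "chained": success_rate(chained, inputs, check),
    }


# Placeholder pipelines so the harness itself can be exercised offline.
# Replace them with functions that call your single prompt and your chain.
inputs = ["short note", "a much longer draft paragraph"]
placeholder_single = lambda text: text
placeholder_chained = lambda text: text + " (revised)"
placeholder_check = lambda text, out: out.endswith("(revised)")

scores = compare(placeholder_single, placeholder_chained, inputs, placeholder_check)
print(scores)
```

Running each input several times matters because sampling makes model output vary between runs; consistency is part of what the exercise asks you to compare.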