Chain-of-Thought — Prompt Engineering Fundamentals (Chapter 5)

Chain-of-thought (CoT) prompting asks the model to write out its reasoning before giving the final answer. This works because language models generate tokens sequentially: each token is conditioned on everything before it, so the reasoning that precedes the answer shapes the answer.

Standard CoT: Add "Let's think step by step" or "Explain your reasoning" to the prompt.

Question: If a store has 47 apples and receives a shipment of 12 more, then sells 23 apples, how many apples remain?

Let's think step by step:
1. Store starts with 47 apples
2. Shipment adds 12 apples: 47 + 12 = 59
3. Sold 23 apples: 59 - 23 = 36

Answer: 36

The intermediate steps constrain the final answer. If the model makes an error in step 2, the error is visible in the output, and the model may correct it in a later step.
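The prompt pattern above can be sketched in Python. This is a minimal illustration, not a fixed API: the helper names build_cot_prompt and extract_final_answer are hypothetical, and the parser assumes the model ends its reply with a line starting with "Answer:", as in the example.

```python
import re
from typing import Optional

# The cue appended to the question; any of the phrasings above would do.
COT_CUE = "Let's think step by step."


def build_cot_prompt(question: str) -> str:
    """Wrap a question in the standard CoT format (hypothetical helper)."""
    return f"Question: {question}\n\n{COT_CUE}"


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' line, or None if there is none.

    Assumes the model follows the 'Answer: ...' convention from the example.
    """
    matches = re.findall(r"^Answer:[ \t]*(.+)$", completion, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None


# A completion in the same shape as the worked example.
sample_completion = """1. Store starts with 47 apples
2. Shipment adds 12 apples: 47 + 12 = 59
3. Sold 23 apples: 59 - 23 = 36

Answer: 36"""

prompt = build_cot_prompt("How many apples remain?")
final_answer = extract_final_answer(sample_completion)
```

Keeping the final answer on its own labeled line makes it easy to separate from the reasoning, which matters when the answer feeds into other code.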

CoT helps when:

The task has multiple steps
Errors are hard to detect in the final output
You need to audit reasoning
The problem involves arithmetic or logical chains
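For arithmetic chains, auditing can be partly automated. The sketch below rechecks every "a op b = c" expression found in a reasoning trace. It assumes integer operands and only the operators +, -, and *; the function name audit_steps is hypothetical, and real model output may need a more forgiving parser.

```python
import re
from typing import List, Tuple

# Matches simple integer expressions such as "47 + 12 = 59".
STEP_PATTERN = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")

OPERATIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def audit_steps(reasoning: str) -> List[Tuple[str, bool]]:
    """Return each arithmetic step found in the text and whether it checks out."""
    results = []
    for match in STEP_PATTERN.finditer(reasoning):
        left, op, right, claimed = match.groups()
        actual = OPERATIONS[op](int(left), int(right))
        results.append((match.group(0), actual == int(claimed)))
    return results


good_trace = """1. Store starts with 47 apples
2. Shipment adds 12 apples: 47 + 12 = 59
3. Sold 23 apples: 59 - 23 = 36"""

# Same trace with a deliberate slip in step 2.
bad_trace = good_trace.replace("47 + 12 = 59", "47 + 12 = 58")

good_report = audit_steps(good_trace)
bad_report = audit_steps(bad_trace)
```

A check like this only catches calculation slips. It cannot tell whether the model set up the right calculation in the first place.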

CoT is less helpful for:

Single-step tasks
Tasks where intermediate steps are subjective
Latency-sensitive applications (CoT increases token count, and therefore response time, significantly)
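The token overhead can be estimated before committing to CoT. The sketch below compares a direct answer with the reasoning trace from the apple example. It uses whitespace-separated words as a crude stand-in for tokens; this is an assumption made for illustration, and a real tokenizer will give different counts.

```python
def rough_token_count(text: str) -> int:
    """Crude proxy for token count: whitespace-separated words."""
    return len(text.split())


direct_output = "Answer: 36"

cot_output = """1. Store starts with 47 apples
2. Shipment adds 12 apples: 47 + 12 = 59
3. Sold 23 apples: 59 - 23 = 36

Answer: 36"""

direct_count = rough_token_count(direct_output)
cot_count = rough_token_count(cot_output)

# Ratio of CoT output length to direct output length.
overhead = cot_count / direct_count
print(f"direct: {direct_count}, CoT: {cot_count}, overhead: {overhead:.1f}x")
```

Output tokens are generated one at a time, so a longer output generally means a slower response.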

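Self-consistency CoT samples several independent reasoning paths for the same prompt (typically with a nonzero sampling temperature, so the paths differ) and selects the most frequent final answer. The template below approximates this within a single prompt: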

Question: [question]

Generate three different reasoning paths, then state the answer that most paths agree on.

Path 1: [reasoning]
Path 2: [reasoning]
Path 3: [reasoning]

Final answer: [most consistent result]

This reduces variance in complex reasoning tasks. On GSM8K (a grade-school math benchmark), self-consistency improved accuracy from 74.4% to 83.4% with Llama-2-70B.
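The sample-and-vote procedure can be sketched as follows. The sample argument stands in for whatever function calls the model, and fake_sample is a hypothetical stub that replays canned completions so the example runs without a model. The helper names and the "Answer:" convention are assumptions, not part of any library.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' line, or None if there is none."""
    matches = re.findall(r"^Answer:[ \t]*(.+)$", completion, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None


def self_consistent_answer(
    prompt: str,
    sample: Callable[[str], str],
    n_samples: int = 5,
) -> Optional[str]:
    """Sample n completions and return the most frequent final answer.

    Completions without a parseable answer are ignored. Ties go to the
    answer that appeared first.
    """
    answers = []
    for _ in range(n_samples):
        answer = extract_final_answer(sample(prompt))
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical stub: replays three canned completions instead of calling a model.
canned = iter([
    "47 + 12 = 59, then 59 - 23 = 36\nAnswer: 36",
    "47 - 23 = 24, then 24 + 12 = 35\nAnswer: 35",  # arithmetic slip
    "12 - 23 = -11, then 47 - 11 = 36\nAnswer: 36",
])


def fake_sample(prompt: str) -> str:
    return next(canned)


voted = self_consistent_answer("Question: ...", fake_sample, n_samples=3)
print(voted)
```

In a real setup, each call to sample would need to produce a different reasoning path, for example by sampling with a nonzero temperature. Identical paths would make the vote meaningless.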

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.