01. The Fundamental Insight
Most prompt engineering tutorials focus on phrasing: "be more concise" or "use bullet points." This misses the core problem. Language models predict the next token. Everything you write, including instructions, context, and examples, shapes the probability distribution over what comes next.
Consider what happens when you write: "Give me a summary of this document." The model generates tokens that statistically tend to be summaries. But "summary" is ambiguous: it could mean three sentences or three paragraphs, formal or casual, focusing on conclusions or methods. The model resolves that ambiguity toward whatever was most common in similar contexts in its training data, which may not be what you wanted.
Prompt engineering is the practice of shaping that probability distribution deliberately. Instead of hoping the model interprets "summary" the way you intended, you specify the structure:
Write a three-sentence summary that:
1. Starts with the main finding
2. Includes one quantitative result
3. Ends with the implication
Document text: [your text here]
This prompt narrows the output space. The model still chooses tokens, but outputs that match your specification are now far more likely than outputs that do not. The constraint is probabilistic, not absolute, so it is still worth checking the result.
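The structured prompt above can also be assembled programmatically, which makes the requirements easy to vary and reuse. The following is a minimal sketch, not a prescribed method: the function name build_summary_prompt and its parameters are hypothetical, chosen only for illustration, and the default requirements are the three from the prompt above.

```python
def build_summary_prompt(document_text, sentence_count=3, requirements=None):
    """Assemble a summary prompt with explicit, numbered structural requirements.

    Hypothetical helper for illustration; the defaults mirror the example prompt.
    """
    if requirements is None:
        requirements = [
            "Starts with the main finding",
            "Includes one quantitative result",
            "Ends with the implication",
        ]
    # Spell out small counts so the first line reads like the hand-written prompt.
    number_words = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}
    count = number_words.get(sentence_count, str(sentence_count))

    lines = [f"Write a {count}-sentence summary that:"]
    lines += [f"{i}. {req}" for i, req in enumerate(requirements, start=1)]
    lines.append(f"Document text: {document_text}")
    return "\n".join(lines)


prompt = build_summary_prompt("[your text here]")
print(prompt)
```

Keeping the requirements in a list means each one can be changed or removed independently, which helps when you want to see which constraint actually affects the output.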
The common failure mode is relying on implicit behavior. "Summarize it" depends on the model having learned what "summarize" means in contexts similar to yours. If your use case differs from the training data distribution (domain-specific terminology, unusual format requirements), the model may produce unexpected results. Explicit structure reduces that risk.
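One practical benefit of explicit structure is that it can be checked mechanically. The sketch below tests two of the requirements from the example prompt (sentence count and the presence of a number). The function check_summary is a hypothetical helper, the sample text is invented for illustration, and the sentence splitter is deliberately naive: it will miscount text containing abbreviations such as "e.g." followed by a space.

```python
import re


def check_summary(text, expected_sentences=3):
    """Return pass/fail checks for two structural requirements.

    Hypothetical helper; the sentence split is a rough heuristic.
    """
    # Split after ., !, or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return {
        "sentence_count_ok": len(sentences) == expected_sentences,
        # Any digit counts as evidence of a quantitative result.
        "has_number": bool(re.search(r"\d", text)),
    }


# Invented sample output, used only to exercise the checker.
sample = (
    "The new cache cut latency. "
    "Median response time fell by 40 percent. "
    "Teams can now serve more users per node."
)
print(check_summary(sample))
```

Requirements such as "ends with the implication" are semantic and cannot be verified this way; those still need a human read or a separate evaluation step.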
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
Take a prompt you currently use (for any task) and rewrite it with explicit structural requirements: number of items, format, starting word, length constraints. Run it three times and note the variation.
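To make "note the variation" concrete, you could record a few simple measurements across runs. The following is a rough sketch under stated assumptions: measure_variation is a hypothetical helper, and generate stands for whatever function calls your model and returns its text. No particular model API is assumed here, so a stand-in function is used in place of a real call.

```python
from difflib import SequenceMatcher
from itertools import combinations


def measure_variation(generate, prompt, runs=3):
    """Call generate(prompt) several times and summarize how much outputs differ."""
    outputs = [generate(prompt) for _ in range(runs)]
    word_counts = [len(o.split()) for o in outputs]
    # Character-level similarity for every pair of outputs (1.0 means identical).
    similarities = [
        SequenceMatcher(None, a, b).ratio() for a, b in combinations(outputs, 2)
    ]
    return {
        "outputs": outputs,
        "word_counts": word_counts,
        "min_similarity": min(similarities) if similarities else 1.0,
    }


# Stand-in for a real model call (assumption): replace with your own function.
def fake_generate(prompt):
    return "placeholder output"


report = measure_variation(fake_generate, "Write a three-sentence summary that: ...")
print(report["word_counts"], report["min_similarity"])
```

Comparing these numbers for the original prompt and the rewritten one gives a rough indication of whether the explicit requirements reduced run-to-run variation. Character-level similarity is a crude measure, so read the outputs as well.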