RUNLOCALAIv38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
Prompt Engineering Fundamentals

21. Automated Optimization

Chapter 21 of 25 · 20 min
KEY INSIGHT

Automated optimization works when evaluation is fast and objective. Optimizing for style or naturalness requires human feedback that automation cannot replicate.

```python
class AutomatedPromptOptimizer:
    def __init__(self, base_prompt, model, evaluator):
        self.model = model
        self.evaluator = evaluator
        self.base_prompt = base_prompt
        self.history = []

    def generate_variant(self, current_prompt, feedback):
        """Generate improved variant based on evaluator feedback."""
        improvement_prompt = (
            f"Given this prompt:\n{current_prompt}\n---\n"
            f"And this evaluation feedback:\n---\n{feedback}\n---\n"
            "Generate an improved version of the prompt that addresses the feedback. "
            "Changes should be specific, not vague rewording. "
            "Output only the new prompt, no explanation."
        )
        return self.model.generate(improvement_prompt)

    def optimize(self, test_cases, max_iterations=10, threshold=0.95):
        """
        Iterative optimization loop.
        Returns best prompt when threshold met or iterations exhausted.
        """
        current_prompt = self.base_prompt
        rejected_streak = 0  # consecutive variants that failed to improve

        for iteration in range(max_iterations):
            # Evaluate current state
            scores = self.evaluator.evaluate(current_prompt, test_cases)
            current_score = scores['avg_correctness']
            self.history.append({
                'iteration': iteration,
                'prompt': current_prompt,
                'score': current_score
            })

            if current_score >= threshold:
                print(f"Threshold reached at iteration {iteration}")
                return current_prompt, self.history

            # Generate feedback for improvement
            feedback = self.evaluator.detailed_feedback(current_prompt, test_cases)

            # Check for score stagnation: repeated rejections mean
            # rewording alone is no longer helping
            if rejected_streak >= 2:
                feedback += " Consider structural changes, not rewording."

            # Generate and test variant
            variant = self.generate_variant(current_prompt, feedback)
            variant_score = self.evaluator.evaluate(variant, test_cases)['avg_correctness']

            # Accept improvement, keep current on regression
            if variant_score > current_score:
                current_prompt = variant
                rejected_streak = 0
            else:
                rejected_streak += 1
                self.history.append({
                    'iteration': iteration,
                    'prompt': f"<REJECTED: score={variant_score}>",
                    'score': variant_score
                })

        # Only strict improvements are accepted, so current_prompt is the best seen
        return current_prompt, self.history
```

**Failure mode:** Optimization can converge to local maxima that exploit evaluation blind spots. A prompt that embeds test case answers as hints within its instructions will score 100% on evaluation while failing on unseen inputs. Countermeasure: reserve held-out test cases that are never used during optimization.

```python
import random


def split_test_cases(all_cases, holdout_ratio=0.2, seed=None):
    """Reserve test cases for final validation only."""
    shuffled = list(all_cases)  # copy so the caller's list is not reordered
    random.Random(seed).shuffle(shuffled)  # pass a seed for a reproducible split
    split_point = int(len(shuffled) * (1 - holdout_ratio))
    return {
        'development': shuffled[:split_point],
        'holdout': shuffled[split_point:]
    }

# Optimization uses only development set
# Final report shows scores on both sets
# Discrepancy > 10% indicates evaluation exploitation
```

Automated optimization typically yields 5–15% improvement over manually written baseline prompts within 10 iterations. Gains plateau after 15 iterations in most cases; additional iterations rarely produce proportional improvement.
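The comparison of development and holdout scores can be sketched as a small check. This is a minimal illustration, not part of the original optimizer: the function name `exploitation_gap` and its parameters are hypothetical, scores are assumed to be fractions in [0, 1], and the 10% threshold is read here as 10 absolute percentage points.

```python
def exploitation_gap(dev_score, holdout_score, max_gap=0.10):
    """Flag a prompt whose development score outruns its holdout score.

    Assumption: both scores are fractions in [0, 1], and the gap is
    measured in absolute points (0.10 == 10 percentage points).
    """
    gap = dev_score - holdout_score
    return {
        'gap': round(gap, 4),       # rounded to avoid float noise in reports
        'suspect': gap > max_gap    # True suggests evaluation exploitation
    }


# Illustrative numbers only, not measured results
report = exploitation_gap(dev_score=0.96, holdout_score=0.78)
if report['suspect']:
    print(f"Holdout gap of {report['gap']:.2f}: inspect the prompt for leaked answers")
```

A flagged gap is a prompt to read the optimized prompt by hand; it does not by itself prove that test answers leaked into the instructions.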

Automated prompt optimization uses meta-prompting to improve prompts without manual iteration. The system generates prompt variants, evaluates them against test cases, and selects improvements iteratively.
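The "evaluates them against test cases" step needs an evaluator that is fast and objective. The sketch below shows one possible shape for a classification task, using exact-match accuracy. The class names `ExactMatchEvaluator` and `KeywordModel`, the `'input'`/`'expected'` test case keys, and the way the input is appended to the prompt are all assumptions made for illustration; only the `evaluate` and `detailed_feedback` method names and the `avg_correctness` key mirror what this chapter's optimizer calls.

```python
class ExactMatchEvaluator:
    """Hypothetical evaluator: scores a prompt by exact-match accuracy."""

    def __init__(self, model):
        self.model = model  # assumed to expose generate(text) -> str

    def _run(self, prompt, case):
        # Assumed convention: the case input is appended after the prompt
        output = self.model.generate(f"{prompt}\n\nInput: {case['input']}")
        return output.strip().lower()

    def evaluate(self, prompt, test_cases):
        """Return the fraction of test cases answered exactly right."""
        if not test_cases:
            return {'avg_correctness': 0.0}
        correct = sum(
            self._run(prompt, case) == case['expected'].lower()
            for case in test_cases
        )
        return {'avg_correctness': correct / len(test_cases)}

    def detailed_feedback(self, prompt, test_cases):
        """List failing cases so the meta-prompt has something concrete to fix."""
        failures = []
        for case in test_cases:
            got = self._run(prompt, case)
            if got != case['expected'].lower():
                failures.append(
                    f"input={case['input']!r} expected={case['expected']!r} got={got!r}"
                )
        if not failures:
            return "All test cases passed."
        return "Failed cases:\n" + "\n".join(failures)


class KeywordModel:
    """Stand-in for a real model so the sketch runs offline."""

    def generate(self, text):
        return "positive" if "great" in text else "negative"


cases = [
    {'input': 'great product', 'expected': 'positive'},
    {'input': 'broke in a day', 'expected': 'negative'},
    {'input': 'not bad at all', 'expected': 'positive'},
]
evaluator = ExactMatchEvaluator(KeywordModel())
scores = evaluator.evaluate("Classify the sentiment.", cases)
feedback = evaluator.detailed_feedback("Classify the sentiment.", cases)
print(scores['avg_correctness'])
print(feedback)
```

Exact match suits classification because the score is cheap to compute and leaves no room for interpretation. Subjective qualities such as tone would need a different, likely human-in-the-loop, evaluator.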


EXERCISE

Implement automated prompt optimization for a classification task. Split test cases with 20% holdout. Run 10 iterations of optimization and report improvement on both development and holdout sets. Document any evaluation exploitation discrepancies.
