How to implement few-shot example selection dynamically
Prerequisites: a vector store with embedded examples and an embedding model.
What this does
A fixed set of few-shot examples cannot cover every kind of query, which can limit model performance on diverse inputs. Dynamic few-shot selection retrieves the most relevant examples from a vector store at runtime, ranked by semantic similarity to the query, which can improve accuracy on domain-specific tasks.
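The idea can be illustrated without any model at all. The following is a toy sketch, assuming word-count vectors as a stand-in for real embeddings; bow_vector, cosine, and pick_examples are hypothetical names used only for illustration, and a real setup would use the embedding model and vector index described in the steps below.

```python
import math
from collections import Counter
from typing import Dict, List


def bow_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_examples(query: str, pool: List[Dict], k: int = 2) -> List[Dict]:
    """Return the k pool entries whose 'input' is most similar to the query."""
    q = bow_vector(query)
    ranked = sorted(pool, key=lambda e: cosine(q, bow_vector(e["input"])), reverse=True)
    return ranked[:k]


pool = [
    {"input": "the battery died after one hour", "output": "negative"},
    {"input": "setup was quick and easy", "output": "positive"},
    {"input": "the battery lasts all day", "output": "positive"},
]

# Both battery-related examples outrank the unrelated setup example.
chosen = pick_examples("how long does the battery last", pool, k=2)
for ex in chosen:
    print(ex["input"], "->", ex["output"])
```

Word overlap is a crude signal (it treats "last" and "lasts" as unrelated words), which is one reason to prefer semantic embeddings in practice.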
Steps
Step 1: Install and set up dependencies
pip install faiss-cpu sentence-transformers
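faiss-cpu provides fast nearest-neighbor search over dense vectors; it offers both exact (flat) and approximate index types, and the flat index used in this guide performs exact search. sentence-transformers generates semantic embeddings for queries and examples.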
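It helps to know what the numbers returned by a search mean. FAISS flat indexes come in two common variants: IndexFlatL2 returns squared Euclidean distances (lower means closer), while IndexFlatIP returns inner products (higher means closer). For unit-length vectors the two produce the same ranking, because the squared distance equals 2 minus twice the inner product. The sketch below checks that identity in plain Python; it makes no FAISS calls, and the helper names are illustrative only.

```python
import math
from typing import List


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def inner_product(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = normalize([1.0, 2.0, 2.0])
b = normalize([2.0, 1.0, 2.0])

cos_sim = inner_product(a, b)  # cosine similarity, since both are unit vectors
dist_sq = squared_l2(a, b)     # squared Euclidean distance

# For unit vectors: ||a - b||^2 == 2 - 2 * (a . b)
print(f"cosine={cos_sim:.4f}  squared_l2={dist_sq:.4f}  2-2cos={2 - 2 * cos_sim:.4f}")
```

The practical consequence is that a threshold such as "keep scores of at least 0.3" only makes sense for a similarity score; applied to a distance it would keep the least similar results.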
Step 2: Define the FewShotSelector
import json
from typing import Dict, List, Optional

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer


class FewShotSelector:
    """Dynamically select relevant few-shot examples from a vector store."""

    def __init__(
        self,
        embedder: SentenceTransformer,
        index_path: Optional[str] = None,
        examples_path: Optional[str] = None,
        top_k: int = 3,
        min_score: float = 0.3,
    ):
        """
        Args:
            embedder: Sentence embedding model for encoding queries and examples.
            index_path: Path to a saved FAISS index. None for new index.
            examples_path: Path to JSON file storing example metadata.
            top_k: Number of examples to retrieve per query.
            min_score: Minimum cosine similarity for an example to be kept.
        """
        self.embedder = embedder
        self.top_k = top_k
        self.min_score = min_score
        self.examples: List[Dict] = []
        self.index = None
        if examples_path:
            self._load_examples(examples_path)
        if index_path:
            self._load_index(index_path)

    def _load_examples(self, path: str):
        with open(path, "r") as f:
            self.examples = json.load(f)
        print(f"[FewShotSelector] Loaded {len(self.examples)} examples.")

    def _load_index(self, path: str):
        # The saved index must have been created by build_index (inner product
        # over normalized vectors) so that scores are cosine similarities.
        self.index = faiss.read_index(path)
        print(f"[FewShotSelector] Loaded FAISS index with {self.index.ntotal} vectors.")

    def build_index(
        self,
        examples: List[Dict],
        index_path: Optional[str] = None,
        examples_path: Optional[str] = None,
    ):
        """
        Build a FAISS index from a list of example dictionaries.
        Each example must have 'input' and 'output' fields.
        """
        self.examples = examples
        texts = [f"{e['input']} -> {e['output']}" for e in examples]
        # Normalize to unit length so that inner product equals cosine similarity.
        embeddings = self.embedder.encode(
            texts, show_progress_bar=True, normalize_embeddings=True
        )
        embeddings = np.asarray(embeddings, dtype="float32")
        dimension = embeddings.shape[1]
        # Inner-product index: higher score means more similar.
        self.index = faiss.IndexFlatIP(dimension)
        self.index.add(embeddings)
        if index_path:
            faiss.write_index(self.index, index_path)
        if examples_path:
            with open(examples_path, "w") as f:
                json.dump(self.examples, f)
        print(f"[FewShotSelector] Built index with {self.index.ntotal} vectors, dim={dimension}.")

    def select(self, query: str, default_examples: Optional[List[Dict]] = None) -> List[Dict]:
        """
        Retrieve the top-k most semantically similar examples for a query.

        Args:
            query: The user's input query.
            default_examples: Examples to use if retrieval yields low scores.

        Returns:
            List of selected example dictionaries.
        """
        # Without a populated index there is nothing to search.
        if self.index is None or self.index.ntotal == 0:
            return (default_examples or [])[: self.top_k]

        query_embedding = self.embedder.encode([query], normalize_embeddings=True)
        query_embedding = np.asarray(query_embedding, dtype="float32")
        k = min(self.top_k, self.index.ntotal)
        scores, indices = self.index.search(query_embedding, k)

        selected = []
        for score, idx in zip(scores[0], indices[0]):
            # FAISS uses -1 for missing results; skip those and stale ids.
            if 0 <= idx < len(self.examples):
                example = self.examples[int(idx)].copy()
                example["similarity_score"] = float(score)
                selected.append(example)

        # Filter by minimum score threshold
        filtered = [e for e in selected if e["similarity_score"] >= self.min_score]

        # Fall back to defaults if no examples meet the threshold
        if not filtered and default_examples:
            return default_examples[: self.top_k]
        return filtered[: self.top_k]
Step 3: Build and populate the example store
# Initialize embedder and selector
embedder = SentenceTransformer("all-MiniLM-L6-v2")
selector = FewShotSelector(embedder, top_k=3, min_score=0.3)
# Define domain-specific examples for sentiment classification
examples = [
    {"input": "The product quality exceeded all expectations.", "output": "positive"},
    {"input": "Delivery was three weeks late and damaged.", "output": "negative"},
    {"input": "Works exactly as described in the specifications.", "output": "positive"},
    {"input": "The interface is confusing and counter-intuitive.", "output": "negative"},
    {"input": "Customer support resolved the issue within an hour.", "output": "positive"},
    {"input": "The battery lasts less than two hours on a full charge.", "output": "negative"},
    {"input": "Setup was straightforward and the app is intuitive.", "output": "positive"},
    {"input": "The return process required seven emails and two phone calls.", "output": "negative"},
]
# Build the vector index
selector.build_index(examples, index_path="sentiment.index", examples_path="sentiment_examples.json")
Step 4: Inject selected examples into prompts
def build_few_shot_prompt(query: str, examples: List[Dict]) -> str:
    """Build a prompt with dynamically selected few-shot examples."""
    example_lines = []
    for ex in examples:
        example_lines.append(f"Input: {ex['input']}")
        example_lines.append(f"Output: {ex['output']}")
        example_lines.append("")
    examples_text = "\n".join(example_lines)
    return (
        "Classify the sentiment of the following input as 'positive' or 'negative'.\n\n"
        f"Examples:\n{examples_text}\n\n"
        f"Input: {query}\n"
        "Output:"
    )
# Test dynamic selection
query = "The packaging arrived dented but the item works perfectly."
selected = selector.select(query)
print(f"Query: {query}")
print(f"Selected {len(selected)} examples:")
for ex in selected:
    print(f" - [{ex['similarity_score']:.3f}] {ex['input']} -> {ex['output']}")
prompt = build_few_shot_prompt(query, selected)
print(f"\nFinal prompt:\n{prompt}")
Step 5: Verify expected output
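Expected output format (the scores and the specific examples selected are illustrative; actual values depend on the embedding model):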
Query: The packaging arrived dented but the item works perfectly.
Selected 3 examples:
- [0.847] The product quality exceeded all expectations. -> positive
- [0.712] Works exactly as described in the specifications. -> positive
- [0.631] The battery lasts less than two hours on a full charge. -> negative
The output must contain:
- A similarity_score for each selected example (higher = more relevant).
- The input and output fields preserved from the original example.
- A final prompt that contains all selected examples followed by the query, with no {{...}} placeholders.
For the default example fallback, pass a non-matching query (e.g., "unrelated topic xyz") and verify the selector returns the provided default_examples when similarity scores fall below 0.3.
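The fallback behavior can also be checked without loading a model by isolating the threshold logic. The following is a minimal sketch, assuming scores where higher means more similar; filter_or_fallback is a hypothetical helper that mirrors the last few lines of select, and the scores are made up for the test.

```python
from typing import Dict, List, Optional


def filter_or_fallback(
    scored: List[Dict],
    min_score: float,
    top_k: int,
    default_examples: Optional[List[Dict]] = None,
) -> List[Dict]:
    """Keep examples at or above min_score; use the defaults if none qualify."""
    kept = [e for e in scored if e["similarity_score"] >= min_score]
    if not kept and default_examples:
        return default_examples[:top_k]
    return kept[:top_k]


# Hypothetical defaults and made-up low scores, for illustration only.
defaults = [{"input": "Great value for the price.", "output": "positive"}]
low_scoring = [
    {"input": "Delivery was late.", "output": "negative", "similarity_score": 0.12},
    {"input": "Works as described.", "output": "positive", "similarity_score": 0.05},
]

fallback_result = filter_or_fallback(low_scoring, min_score=0.3, top_k=3, default_examples=defaults)
no_default_result = filter_or_fallback(low_scoring, min_score=0.3, top_k=3)

print(fallback_result)    # the defaults, because nothing reached 0.3
print(no_default_result)  # empty list, because no defaults were supplied
```

Note the second case: when no defaults are supplied and nothing meets the threshold, the result is an empty list, so the prompt builder should tolerate zero examples.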
Verification
Common failures
Embedding dimension mismatch between index and query. If the index was built with one embedding model (e.g., 384-dim) and the query is encoded with a different model (e.g., 768-dim), the FAISS search fails with a dimension error. Always use the same embedding model for both build_index and select.
Low-quality retrieval with sparse examples. With a small pool (for example, fewer than 20 examples), the nearest neighbors may be only loosely related to the query, so the selected examples add little. Seed the index with at least 50-100 diverse examples covering the full range of expected query types before deploying to production.
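For the dimension mismatch described above, an explicit guard before searching can turn a low-level failure into a readable message. This is a sketch under stated assumptions: check_dimension is a hypothetical helper, and in a real setup the expected dimension would come from the FAISS index's d attribute and the vector from the embedder's encode call.

```python
from typing import Sequence


def check_dimension(index_dim: int, query_vector: Sequence[float]) -> None:
    """Raise a descriptive error if the query vector does not fit the index."""
    if len(query_vector) != index_dim:
        raise ValueError(
            f"Query embedding has {len(query_vector)} dimensions but the index "
            f"expects {index_dim}. Was the index built with a different model?"
        )


# Matching dimensions pass silently.
check_dimension(384, [0.0] * 384)

# A 768-dim query against a 384-dim index is rejected with a clear message.
try:
    check_dimension(384, [0.0] * 768)
    mismatch_message = ""
except ValueError as err:
    mismatch_message = str(err)

print(mismatch_message)
```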
Category collision in example outputs. If example outputs like "positive" and "negative" are not cleanly separated in embedding space, the selector may return examples from the wrong category. Periodically re-run selection on a validation set and measure precision per category.
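One way to run that check is sketched below. It assumes a particular definition of precision (the share of retrieved examples whose output matches the validation query's gold label, grouped by that label), which is only one reasonable choice. precision_per_category and fake_select are hypothetical; in practice select_fn would be the selector's select method.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def precision_per_category(
    validation: List[Tuple[str, str]],
    select_fn: Callable[[str], List[Dict]],
) -> Dict[str, float]:
    """For each gold label, the fraction of retrieved examples sharing that label."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for query, label in validation:
        for ex in select_fn(query):
            totals[label] += 1
            if ex["output"] == label:
                hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


def fake_select(query: str) -> List[Dict]:
    """Hypothetical stand-in for selector.select, with canned results."""
    if "late" in query:
        return [{"output": "negative"}, {"output": "negative"}]
    return [{"output": "positive"}, {"output": "negative"}]


validation = [("arrived late again", "negative"), ("love this product", "positive")]
report = precision_per_category(validation, fake_select)
print(report)
```

A category whose precision drops noticeably between runs is a signal to add or revise examples for that category.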
- Version mismatch - The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
- Local environment drift - Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
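Both of the last two checks can be done in one step by printing the active interpreter and the installed package versions. This is a minimal sketch using only the standard library; describe_environment is a hypothetical helper, and the package names are the distribution names used in the install command above plus numpy.

```python
import sys
from importlib import metadata
from typing import Dict, Tuple


def describe_environment(
    packages: Tuple[str, ...] = ("faiss-cpu", "sentence-transformers", "numpy"),
) -> Dict[str, str]:
    """Collect the interpreter path, Python version, and package versions."""
    info = {"python": sys.version.split()[0], "executable": sys.executable}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

If the executable path points outside the expected virtual environment, fix the environment first before changing any of the guide's steps.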
Related guides
- Build a Custom Prompt Template Library - Combine the few-shot selector with the template library to inject dynamically selected examples as template variables, enabling reusable dynamic prompting across agents.
- Setup Agent Error Recovery and Retry Logic - Pair dynamic few-shot selection with retry logic so that if an agent's output is flagged as low-confidence, the next retry uses an updated set of examples.