How to implement few-shot example selection dynamically

Advanced · 30 min · By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
Prerequisites

Vector store with embedded examples, embedding model

What this does

A fixed set of few-shot examples cannot cover diverse queries well. Dynamic few-shot selection retrieves, at runtime, the examples in a vector store that are semantically closest to the incoming query, which can improve accuracy on domain-specific tasks.

Steps

Step 1: Install and set up dependencies

pip install faiss-cpu sentence-transformers

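faiss-cpu provides fast nearest-neighbor search over dense vectors, with both exact and approximate index types. This guide uses a flat (exact) index, which is adequate for small example sets. sentence-transformers generates semantic embeddings for queries and examples.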
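FAISS flat indexes report scores in two different senses, which matters for the min_score threshold used later: IndexFlatL2 returns squared Euclidean distances (smaller is closer), while IndexFlatIP returns inner products (larger is closer, and equal to cosine similarity when the vectors have unit length). The pure-Python sketch below illustrates the identity d² = 2 - 2·cos that links the two for unit vectors. It uses made-up 3-dimensional vectors rather than real embeddings, and the helper names are illustrative only.

```python
import math
from typing import List


def normalize(v: List[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def inner_product(a: List[float], b: List[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance, the quantity a flat L2 index reports."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2_to_cosine(dist_sq: float) -> float:
    """Convert a squared L2 distance between unit vectors to cosine similarity."""
    return 1.0 - dist_sq / 2.0


# Toy vectors standing in for embeddings (assumed values, not model output).
a = normalize([1.0, 2.0, 2.0])
b = normalize([2.0, 1.0, 2.0])

cosine = inner_product(a, b)
dist_sq = squared_l2(a, b)

print(f"cosine similarity:   {cosine:.4f}")
print(f"squared L2 distance: {dist_sq:.4f}")
print(f"recovered cosine:    {l2_to_cosine(dist_sq):.4f}")
```

A threshold such as min_score=0.3 only makes sense as a lower bound when the index returns a similarity. With squared L2 distances the comparison would have to be inverted or the distance converted first.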

Step 2: Define the FewShotSelector

from sentence_transformers import SentenceTransformer
import numpy as np
from typing import List, Dict, Optional

class FewShotSelector:
    """Dynamically select relevant few-shot examples from a vector store."""

    def __init__(
        self,
        embedder: SentenceTransformer,
        index_path: Optional[str] = None,
        examples_path: Optional[str] = None,
        top_k: int = 3,
        min_score: float = 0.3
    ):
        """
        Args:
            embedder: Sentence embedding model for encoding queries and examples.
            index_path: Path to a saved FAISS index. None for new index.
            examples_path: Path to JSON file storing example metadata.
            top_k: Number of examples to retrieve per query.
            min_score: Minimum similarity score threshold (0-1).
        """
        self.embedder = embedder
        self.top_k = top_k
        self.min_score = min_score
        self.examples: List[Dict] = []
        self.index = None

        if examples_path:
            self._load_examples(examples_path)
        if index_path:
            self._load_index(index_path)

    def _load_examples(self, path: str):
        import json
        with open(path, "r") as f:
            self.examples = json.load(f)
        print(f"[FewShotSelector] Loaded {len(self.examples)} examples.")

    def _load_index(self, path: str):
        import faiss
        self.index = faiss.read_index(path)
        print(f"[FewShotSelector] Loaded FAISS index with {self.index.ntotal} vectors.")

    def build_index(self, examples: List[Dict], index_path: Optional[str] = None, examples_path: Optional[str] = None):
        """
        Build a FAISS index from a list of example dictionaries.
        Each example must have 'input' and 'output' fields.
        """
        import faiss  # lazy import, matching _load_index
        import json

        self.examples = examples
        texts = [f"{e['input']} -> {e['output']}" for e in examples]
        # Unit-length embeddings make the inner product equal to cosine similarity.
        embeddings = self.embedder.encode(texts, show_progress_bar=True, normalize_embeddings=True)
        embeddings = np.asarray(embeddings, dtype="float32")

        dimension = embeddings.shape[1]
        # IndexFlatIP is an exact index; higher scores mean more similar.
        self.index = faiss.IndexFlatIP(dimension)
        self.index.add(embeddings)

        if index_path:
            faiss.write_index(self.index, index_path)
        if examples_path:
            with open(examples_path, "w") as f:
                json.dump(self.examples, f)

        print(f"[FewShotSelector] Built index with {self.index.ntotal} vectors, dim={dimension}.")

    def select(self, query: str, default_examples: Optional[List[Dict]] = None) -> List[Dict]:
        """
        Retrieve the top-k most semantically similar examples for a query.
        Assumes the index holds normalized vectors in an inner-product index,
        as produced by build_index, so scores are cosine similarities.
        Args:
            query: The user's input query.
            default_examples: Examples to use if retrieval yields low scores.
        Returns:
            List of selected example dictionaries.
        """
        if self.index is None:
            raise RuntimeError("No index available; call build_index() or pass index_path.")

        # Encode the query the same way as the examples (unit length, float32).
        query_embedding = np.asarray(
            self.embedder.encode([query], normalize_embeddings=True), dtype="float32"
        )
        scores, indices = self.index.search(query_embedding, self.top_k)

        selected = []
        for score, idx in zip(scores[0], indices[0]):
            # FAISS pads missing results with -1, so check both bounds.
            if 0 <= idx < len(self.examples):
                example = self.examples[idx].copy()
                example["similarity_score"] = float(score)
                selected.append(example)

        # Filter by minimum score threshold
        filtered = [e for e in selected if e["similarity_score"] >= self.min_score]

        # Fall back to defaults if no examples meet the threshold
        if not filtered and default_examples:
            return default_examples[:self.top_k]

        return filtered[:self.top_k]

Step 3: Build and populate the example store

# Initialize embedder and selector
embedder = SentenceTransformer("all-MiniLM-L6-v2")
selector = FewShotSelector(embedder, top_k=3, min_score=0.3)

# Define domain-specific examples for sentiment classification
examples = [
    {"input": "The product quality exceeded all expectations.", "output": "positive"},
    {"input": "Delivery was three weeks late and damaged.", "output": "negative"},
    {"input": "Works exactly as described in the specifications.", "output": "positive"},
    {"input": "The interface is confusing and counter-intuitive.", "output": "negative"},
    {"input": "Customer support resolved the issue within an hour.", "output": "positive"},
    {"input": "The battery lasts less than two hours on a full charge.", "output": "negative"},
    {"input": "Setup was straightforward and the app is intuitive.", "output": "positive"},
    {"input": "The return process required seven emails and two phone calls.", "output": "negative"},
]

# Build the vector index
selector.build_index(examples, index_path="sentiment.index", examples_path="sentiment_examples.json")
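Because build_index reads the 'input' and 'output' fields of every example, a malformed record only surfaces as a KeyError partway through indexing. A small pre-flight check can catch this earlier. The sketch below is one possible approach; validate_examples and REQUIRED_FIELDS are hypothetical names, not part of the selector class, and the duplicate check is an extra assumption about what counts as a problem.

```python
from typing import Dict, List

REQUIRED_FIELDS = ("input", "output")  # fields the index builder reads


def validate_examples(examples: List[Dict]) -> List[str]:
    """Return human-readable problems; an empty list means the set looks usable."""
    problems: List[str] = []
    seen_inputs = set()
    for i, ex in enumerate(examples):
        if not isinstance(ex, dict):
            problems.append(f"example {i}: expected a dict, got {type(ex).__name__}")
            continue
        for field in REQUIRED_FIELDS:
            value = ex.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"example {i}: missing or empty '{field}'")
        text = ex.get("input")
        if isinstance(text, str) and text.strip():
            # Duplicate inputs waste top-k slots on identical neighbors.
            if text in seen_inputs:
                problems.append(f"example {i}: duplicate input")
            seen_inputs.add(text)
    return problems


# Deliberately flawed sample data for illustration.
sample = [
    {"input": "Works exactly as described in the specifications.", "output": "positive"},
    {"input": "Delivery was three weeks late and damaged.", "output": ""},
    {"output": "negative"},
]

problems = validate_examples(sample)
for problem in problems:
    print(problem)
```

Running the check before selector.build_index(...) and refusing to index when the returned list is non-empty keeps bad records out of the saved index file.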

Step 4: Inject selected examples into prompts

def build_few_shot_prompt(query: str, examples: List[Dict]) -> str:
    """Build a prompt with dynamically selected few-shot examples."""
    example_lines = []
    for ex in examples:
        example_lines.append(f"Input: {ex['input']}")
        example_lines.append(f"Output: {ex['output']}")
        example_lines.append("")

    examples_text = "\n".join(example_lines)
    return (
        "Classify the sentiment of the following input as 'positive' or 'negative'.\n\n"
        f"Examples:\n{examples_text}\n\n"
        f"Input: {query}\n"
        "Output:"
    )

# Test dynamic selection
query = "The packaging arrived dented but the item works perfectly."
selected = selector.select(query)

print(f"Query: {query}")
print(f"Selected {len(selected)} examples:")
for ex in selected:
    print(f"  - [{ex['similarity_score']:.3f}] {ex['input']} -> {ex['output']}")

prompt = build_few_shot_prompt(query, selected)
print(f"\nFinal prompt:\n{prompt}")
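The step above stops at printing the prompt. To complete the loop on the target environment, the prompt can be sent to a locally running Ollama server. The sketch below assumes Ollama's default address (http://localhost:11434), its /api/generate endpoint with streaming disabled, and a model named llama3.2 that has already been pulled. The model name and the helpers build_generate_payload and classify are illustrative and not part of the original guide.

```python
import json
import urllib.request
from typing import Dict

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address
MODEL_NAME = "llama3.2"  # hypothetical; substitute a model you have pulled


def build_generate_payload(prompt: str, model: str = MODEL_NAME) -> Dict:
    """Build the JSON body for a single non-streaming generation request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": {"temperature": 0},  # keep classification output stable
    }


def classify(prompt: str, model: str = MODEL_NAME, url: str = OLLAMA_URL,
             timeout: float = 60.0) -> str:
    """Send the prompt to Ollama and return the model's reply, lowercased."""
    body = json.dumps(build_generate_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    return reply["response"].strip().lower()


# Building the payload needs no server, so it is safe to inspect offline.
payload = build_generate_payload("Input: Works as described.\nOutput:")
print(json.dumps(payload, indent=2))
```

Calling classify(prompt) requires the Ollama server to be running. The reply is free text, so it is worth checking that it is exactly 'positive' or 'negative' before using it downstream.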

Step 5: Verify expected output

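Expected output format (the scores and the examples selected are illustrative; actual values depend on the embedding model and index type):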

Query: The packaging arrived dented but the item works perfectly.
Selected 3 examples:
  - [0.847] The product quality exceeded all expectations. -> positive
  - [0.712] Works exactly as described in the specifications. -> positive
  - [0.631] The battery lasts less than two hours on a full charge. -> negative

The output must contain:

  • A similarity_score for each selected example (higher = more relevant).
  • The input and output fields preserved from the original example.
  • The final prompt contains all selected examples followed by the query and no {{...}} placeholders.
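The three requirements above can be turned into an automated check. The sketch below is a minimal version that assumes the prompt layout produced in Step 4 ("Input: ..." and "Output: ..." lines, ending with the query and a bare "Output:"). check_prompt is a hypothetical helper, and the similarity score in the sample data is a made-up value.

```python
import re
from typing import Dict, List

PLACEHOLDER = re.compile(r"\{\{.*?\}\}")  # matches unfilled {{...}} template slots


def check_prompt(prompt: str, query: str, examples: List[Dict]) -> List[str]:
    """Return a list of violations; an empty list means the prompt passes."""
    problems: List[str] = []
    for i, ex in enumerate(examples):
        if not isinstance(ex.get("similarity_score"), float):
            problems.append(f"example {i}: missing similarity_score")
        if f"Input: {ex['input']}" not in prompt:
            problems.append(f"example {i}: input not found in prompt")
        if f"Output: {ex['output']}" not in prompt:
            problems.append(f"example {i}: output not found in prompt")
    if not prompt.endswith(f"Input: {query}\nOutput:"):
        problems.append("prompt does not end with the query followed by 'Output:'")
    if PLACEHOLDER.search(prompt):
        problems.append("prompt contains an unfilled {{...}} placeholder")
    return problems


query = "The packaging arrived dented but the item works perfectly."
examples = [
    {
        "input": "Works exactly as described in the specifications.",
        "output": "positive",
        "similarity_score": 0.5,  # made-up score for illustration
    }
]
good_prompt = (
    "Classify the sentiment of the following input as 'positive' or 'negative'.\n\n"
    "Examples:\n"
    "Input: Works exactly as described in the specifications.\n"
    "Output: positive\n\n\n"
    f"Input: {query}\n"
    "Output:"
)
bad_prompt = good_prompt.replace(query, "{{query}}")

print(check_prompt(good_prompt, query, examples))
print(check_prompt(bad_prompt, query, examples))
```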

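To test the default-example fallback, call select() with a query unrelated to any stored example (e.g., "unrelated topic xyz") and a default_examples list, then verify that the selector returns the defaults when every similarity score falls below min_score (0.3 here). Default examples carry no similarity_score field, so read it with ex.get("similarity_score") when printing them.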
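The threshold-and-fallback behavior can also be exercised without FAISS or an embedding model by isolating the filtering step. The sketch below mirrors the logic at the end of select() under that assumption. apply_threshold is a hypothetical stand-alone function, and the scores are made-up values rather than model output.

```python
from typing import Dict, List, Optional, Tuple


def apply_threshold(
    scored: List[Tuple[float, Dict]],
    min_score: float,
    top_k: int,
    default_examples: Optional[List[Dict]] = None,
) -> List[Dict]:
    """Keep examples scoring at least min_score; fall back to defaults if none do."""
    kept = [
        dict(example, similarity_score=score)
        for score, example in scored
        if score >= min_score
    ]
    kept.sort(key=lambda e: e["similarity_score"], reverse=True)
    if not kept and default_examples:
        return default_examples[:top_k]
    return kept[:top_k]


stored = {"input": "Delivery was three weeks late and damaged.", "output": "negative"}
other = {"input": "Setup was straightforward and the app is intuitive.", "output": "positive"}
defaults = [{"input": "I love it.", "output": "positive"}]

# Every score is below the threshold, so the defaults are returned.
fallback = apply_threshold([(0.12, stored), (0.05, other)], 0.3, 3, defaults)

# One score clears the threshold, so the defaults are ignored.
matched = apply_threshold([(0.12, stored), (0.62, other)], 0.3, 3, defaults)

# No defaults supplied and nothing clears the threshold: empty result.
empty = apply_threshold([(0.12, stored)], 0.3, 3)

print(fallback)
print(matched)
print(empty)
```

The last case is worth handling in calling code: an empty list yields a prompt with no examples at all, which may or may not be acceptable for the task.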

Common failures

  1. Embedding dimension mismatch between index and query. If the index was built with one embedding model (e.g., 384-dim) and the query is encoded with another (e.g., 768-dim), the FAISS search fails with a dimension error. Always use the same embedding model for both build_index and select, including when loading a saved index from disk.

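  2. Low-quality retrieval with sparse examples. When the index holds fewer than about 20 examples (the demo above uses 8), the nearest neighbors of a query may be only loosely related to it, so the retrieved examples add little signal. As a rule of thumb, seed the index with at least 50-100 diverse examples covering the full range of expected query types before deploying to production.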

  3. Category collision in example outputs. If example outputs like "positive" and "negative" are not cleanly separated in embedding space, the selector may return examples from the wrong category. Periodically re-run selection on a validation set and measure precision per category.

  • Version mismatch - The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
  • Local environment drift - Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
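Item 3 above recommends measuring precision per category on a validation set. One way to do that is sketched below, under the assumption that precision for a category means the fraction of retrieved examples whose output matches the label of the validation query. precision_per_category is a hypothetical helper, and the canned selector stands in for selector.select so the block runs without a model.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def precision_per_category(
    validation: List[Tuple[str, str]],
    select_fn: Callable[[str], List[Dict]],
) -> Dict[str, float]:
    """For each query label, the share of retrieved examples carrying that label."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for query, label in validation:
        for example in select_fn(query):
            totals[label] += 1
            if example["output"] == label:
                hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Labeled validation queries (made-up data for illustration).
validation = [
    ("Great value for the price.", "positive"),
    ("Stopped working after a week.", "negative"),
]

# Canned retrieval results standing in for selector.select.
canned = {
    "Great value for the price.": [
        {"input": "The product quality exceeded all expectations.", "output": "positive"},
        {"input": "Works exactly as described in the specifications.", "output": "positive"},
    ],
    "Stopped working after a week.": [
        {"input": "The battery lasts less than two hours on a full charge.", "output": "negative"},
        {"input": "Works exactly as described in the specifications.", "output": "positive"},
    ],
}

precision = precision_per_category(validation, lambda q: canned[q])
for label, value in precision.items():
    print(f"{label}: {value:.2f}")
```

In practice, passing selector.select as select_fn and tracking these numbers over time gives an early signal when newly added examples start pulling in the wrong category.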

Related guides

  • Build a Custom Prompt Template Library - Combine the few-shot selector with the template library to inject dynamically selected examples as template variables, enabling reusable dynamic prompting across agents.
  • Setup Agent Error Recovery and Retry Logic - Pair dynamic few-shot selection with retry logic so that if an agent's output is flagged as low-confidence, the next retry uses an updated set of examples.