
How to Implement Self-Querying Retrieval in RAG

Advanced · 30 min · By Fredoline Eruo

Target environment

Ubuntu 24.04 · Ollama 0.4.x

Prerequisites

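LangChain installed (langchain, langchain-ollama, langchain-community), along with chromadb for the vector store, lark for parsing the generated filter expression, and jq for JSONLoader. The vector store must hold documents that carry metadata.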

What this does

A self-querying retriever inspects the user's question, extracts any structured filters such as date ranges or categories, and applies them during retrieval. This allows natural-language queries such as "reports from 2024" or "docs about security published by the legal team" to automatically trigger metadata filters without manual filter construction.
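
As a rough illustration of the idea (not LangChain's internal representation), the question can be thought of as being split into a semantic search string plus a metadata filter. The `StructuredQuery` class, the `matches` helper, and the sample documents below are hypothetical names invented for this sketch, and only exact-match filters are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredQuery:
    """Hypothetical container for what a self-querying step conceptually produces."""
    query: str                                   # text used for semantic similarity search
    filters: dict = field(default_factory=dict)  # exact-match metadata constraints


def matches(metadata: dict, filters: dict) -> bool:
    """Return True when every filter key equals the document's metadata value."""
    return all(metadata.get(key) == value for key, value in filters.items())


# "docs about security published by the legal team" might decompose into:
structured = StructuredQuery(query="security", filters={"category": "legal"})

# Illustrative in-memory documents (made up for this sketch).
documents = [
    {"text": "Security policy review", "metadata": {"category": "legal"}},
    {"text": "Security hardening guide", "metadata": {"category": "engineering"}},
]

# The filter narrows the candidate set before (or alongside) similarity search.
filtered = [d for d in documents if matches(d["metadata"], structured.filters)]
print([d["text"] for d in filtered])
```

In the real retriever the LLM produces the filter and the vector store applies it; the sketch only shows why both documents mention security yet only one survives the category constraint.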

Steps

  1. Import self-querying components. The SelfQueryRetriever requires a schema and LLM for filter extraction.

    import os
    os.environ["OLLAMA_BASE_URL"] = "http://localhost:11434"
    
    from langchain_ollama import ChatOllama, OllamaEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain.chains.query_constructor.base import AttributeInfo
    
  2. Define metadata attributes. Provide the LLM with field names, types, and descriptions.

    metadata_field_info = [
        AttributeInfo(
            name="date",
            description="Publication date of the document in YYYY-MM-DD format",
            type="string",
        ),
        AttributeInfo(
            name="category",
            description="Topic category, one of: engineering, legal, finance",
            type="string",
        ),
        AttributeInfo(
            name="author",
            description="Author name",
            type="string",
        ),
    ]
    
  3. Load documents with metadata. Ensure each document carries structured fields.

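    from langchain_community.document_loaders import JSONLoader

    def extract_metadata(record: dict, metadata: dict) -> dict:
        # JSONLoader only records "source" and "seq_num" by default, so copy
        # the structured fields from each JSON record, with fallback defaults.
        return {
            "date": record.get("date", "2024-01-01"),
            "category": record.get("category", "engineering"),
            "author": record.get("author", "unknown"),
        }

    loader = JSONLoader(
        "context/documents.json",
        jq_schema=".[]",
        content_key="text",
        metadata_func=extract_metadata,
    )
    docs = loader.load()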
    
  4. Create the self-querying retriever. Point it at the vector store and LLM.

    from langchain.retrievers.self_query.base import SelfQueryRetriever
    
    embeddings = OllamaEmbeddings(model="llama3")
    db = Chroma.from_documents(docs, embeddings)
    llm = ChatOllama(model="llama3")
    
    retriever = SelfQueryRetriever.from_llm(
        llm,
        db,
        document_content_description="Technical document",
        metadata_field_info=metadata_field_info,
    )
    
  5. Query with automatic filters. The LLM extracts filters from natural language.

    results = retriever.invoke("Show me legal documents from 2024")
    for r in results:
        print(r.page_content, r.metadata)
    

    Expected output: documents whose category is "legal" and whose date starts with "2024".
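
One way to confirm the expected output is to inspect the returned metadata directly rather than reading it by eye. The sketch below assumes each result exposes a `metadata` dict, as in the loop in step 5; the `check_results` helper and the sample dicts are hypothetical stand-ins so the snippet runs without a model or vector store.

```python
def check_results(metadatas, category, year):
    """Return the metadata dicts that violate the expected category/year filter."""
    return [
        m for m in metadatas
        if m.get("category") != category
        or not str(m.get("date", "")).startswith(year)
    ]


# Stand-in metadata; with a real retriever use: [r.metadata for r in results]
sample = [
    {"date": "2024-03-15", "category": "legal", "author": "A"},
    {"date": "2023-11-02", "category": "legal", "author": "B"},
]

violations = check_results(sample, category="legal", year="2024")
print(violations)
```

An empty list means every returned document satisfied the filter; any entries that remain point to a filter the LLM failed to extract or the store failed to apply.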

Verification

python -c "
from langchain.chains.query_constructor.base import AttributeInfo
info = AttributeInfo(name='date', description='date field', type='string')
print(info.name)
# Expected: date
"

Common failures

  • LLM failing to extract filters. If the model misinterprets the query, adjust the description fields in AttributeInfo to be more explicit.
  • Metadata missing in documents. Every document must carry the defined fields; missing keys cause retrieval errors.
  • Filter values not matching schema. Date formats, category spellings, and author names must exactly match the defined metadata; inconsistent casing breaks filtering.
  • Self-query returning no results. Test the filter schema independently before wiring it into the retriever.
  • Version mismatch. The installed package or runtime differs from the command shown; check the version first and rerun the smallest verification command.
  • Local environment drift. Another service, virtual environment, model, or path is being used; print the active binary path and configuration before changing the guide steps.
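
Several of the failures above can be caught with a small pre-flight check before the retriever is built. The following is a minimal sketch under stated assumptions: the package names are inferred from the imports in this guide, the allowed categories come from the `category` description in step 2, and `installed_versions` and `metadata_problems` are hypothetical helper names. The date check is loose (it only confirms the value parses as year-month-day).

```python
import sys
from datetime import datetime
from importlib import metadata as pkg_metadata

REQUIRED_FIELDS = ("date", "category", "author")
ALLOWED_CATEGORIES = {"engineering", "legal", "finance"}  # from the step 2 description


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = pkg_metadata.version(name)
        except pkg_metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def metadata_problems(doc_metadata):
    """List schema problems for one document's metadata dict."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in doc_metadata]
    date = doc_metadata.get("date")
    if date is not None:
        try:
            datetime.strptime(date, "%Y-%m-%d")
        except (TypeError, ValueError):
            problems.append(f"bad date format: {date!r}")
    category = doc_metadata.get("category")
    if category is not None and category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category: {category!r}")
    return problems


# Environment drift: show which interpreter and package versions are active.
print("python:", sys.executable)
for name, version in installed_versions(["langchain", "langchain-ollama", "chromadb", "lark"]).items():
    print(f"{name}: {version or 'not installed'}")

# Schema mismatch: inconsistent casing and a missing key are both reported.
print(metadata_problems({"date": "2024-01-01", "category": "Legal"}))
```

With loaded documents, the same check can be run as `metadata_problems(doc.metadata)` for each `doc` in `docs`, fixing or dropping any document that reports problems before it is indexed.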
