Natural Language Queries — Data Analysis with Local AI (Chapter 5)

Beyond SQL generation, natural language interfaces provide conversational data exploration. Users ask follow-up questions, request clarifications, and refine queries through dialogue. Implementing this requires managing conversation context and handling ambiguous requests gracefully.

Building a Query Interface

A complete natural language query interface handles multiple turns, maintains context, and provides clear responses. The implementation combines text-to-SQL with response formatting and error handling.

import ollama
import sqlite3
from dataclasses import dataclass
from typing import Optional

@dataclass
class QueryContext:
    """Manages conversation context for natural language queries."""
    
    schema: str
    history: list = None
    
    def __post_init__(self):
        if self.history is None:
            self.history = []
    
    def add_turn(self, question: str, sql: str, result: str):
        self.history.append({
            'question': question,
            'sql': sql,
            'result': result
        })
    
    def get_context_prompt(self) -> str:
        context = f"Schema:\n{self.schema}\n\n"
        context += "Previous queries in this session:\n"
        for turn in self.history[-3:]:  # Last 3 turns
            context += f"Q: {turn['question']}\nSQL: {turn['sql']}\n\n"
        return context

def natural_language_query(
    question: str, 
    ctx: QueryContext, 
    conn: sqlite3.Connection
) -> dict:
    """Process natural language question and return results."""
    
    # Build context-aware prompt
    context = ctx.get_context_prompt()
    prompt = f"""{context}
    
    Current question: {question}
    
    Generate SQL query. Return ONLY the SQL query."""
    
    # Generate query
    response = ollama.chat(
        model='llama3.2',
        messages=[{'role': 'user', 'content': prompt}]
    )
    
    sql = response['message']['content'].strip()
    
    # Execute and format results
    try:
        df = pd.read_sql_query(sql, conn)
        ctx.add_turn(question, sql, "Success")
        return {
            'sql': sql,
            'success': True,
            'results': df.to_dict('records'),
            'row_count': len(df)
        }
    except Exception as e:
        ctx.add_turn(question, sql, f"Error: {e}")
        return {
            'sql': sql,
            'success': False,
            'error': str(e)
        }

# Usage example
schema = """CREATE TABLE sales (id, product, amount, date)"""
ctx = QueryContext(schema=schema)

# First question
result1 = natural_language_query(
    "What are total sales by product?",
    ctx,
    conn
)
print(f"SQL: {result1['sql']}")
print(f"Results: {result1.get('row_count', 0)} rows")

# Follow-up question
result2 = natural_language_query(
    "Show the same for last quarter",
    ctx,
    conn
)
print(f"SQL: {result2['sql']}")

The context mechanism allows follow-up questions to reference previous queries without repeating details. The model understands "show the same" refers to the previous breakdown.

Handling Ambiguity

Natural language contains inherent ambiguity. "Show sales" might mean total sales, sales per day, or sales per region. Resolving ambiguity requires either asking for clarification or making reasonable assumptions and explaining them.

def resolve_ambiguity(question: str, candidates: list, ctx: QueryContext) -> str:
    """Handle ambiguous questions by proposing clarifications."""
    
    prompt = f"""Question: {question}
    Possible interpretations:
    {chr(10).join(f"- {c}" for c in candidates)}
    
    If ambiguous, ask a clarifying question. If clear, state the assumption being made."""
    
    response = ollama.chat(
        model='llama3.2',
        messages=[{'role': 'user', 'content': prompt}]
    )
    
    return response['message']['content']

# Example ambiguity handling
question = "Show me sales trends"
candidates = [
    "Daily sales over time",
    "Monthly sales comparison by region",
    "Product category sales comparison"
]
clarification = resolve_ambiguity(question, candidates, ctx)
print(clarification)

When ambiguity exists, the response asks a specific question to resolve it rather than picking arbitrarily. Users appreciate being asked rather than receiving unexpected results.