RUNLOCALAIv38
->Will it run?Best GPUCompareTroubleshootStartLearnPulseModelsHardwareToolsBench
Run check
RUNLOCALAI

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
DIR
  • Models
  • Hardware
  • Tools
  • Benchmarks
TOOLS
  • Will it run?
  • Compare hardware
  • Cost vs cloud
  • Choose my GPU
  • Prompting kits
  • Quick answers
REF
  • All buyer guides
  • Learn local AI
  • Methodology
  • Glossary
  • Errors KB
  • Trust
EDITOR
  • About
  • Author
  • How we make money
  • Editorial policy
  • Contact
LEGAL
  • Privacy
  • Terms
  • Sitemap
MAIL · MONTHLY DIGEST
Get monthly local AI changes
Monthly recap. No spam.
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

© 2026 runlocalai.coIndependently operated
RUNLOCALAI · v38
  1. >
  2. Home
  3. /Learn
  4. /Courses
  5. /Data Analysis with Local AI
  6. /Ch. 4
Data Analysis with Local AI

04. Text-to-SQL

Chapter 4 of 18 · 25 min
KEY INSIGHT

Text-to-SQL reduces the technical barrier to data access, but generated queries require validation against actual schema and execution results.

Converting natural language questions to SQL queries enables non-technical users to access data directly. Local models can generate SQL without data leaving the environment, addressing privacy concerns that prevent using external services for query generation.

Text-to-SQL involves understanding the question, mapping concepts to table structures, and constructing valid SQL syntax. Local models can learn schema details and generate queries that respect table relationships and constraints.

Schema-Aware Query Generation

Effective query generation requires the model to understand table structures, column meanings, and relationships. Providing schema context significantly improves query accuracy.

import ollama
import sqlite3

def generate_sql(question: str, schema: str, examples: list = None) -> str:
    """Generate SQL query from natural language question."""
    
    prompt = f"""Given this database schema:
    {schema}
    
    Generate a SQL query for this question:
    {question}
    
    Return ONLY the SQL query, no explanation."""
    
    if examples:
        prompt = f"Examples of correct queries:\n{examples}\n\n{prompt}"
    
    response = ollama.chat(
        model='llama3.2',
        messages=[{'role': 'user', 'content': prompt}]
    )
    
    return response['message']['content'].strip()

# Example schema
schema = """
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT,
    signup_date DATE,
    region TEXT
);

CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    order_date DATE,
    total DECIMAL(10,2),
    status TEXT
);
"""

# Generate query
question = "How many customers signed up in each region last month?"
sql = generate_sql(question, schema)
print(sql)

The model should return something like:

SELECT c.region, COUNT(*) as customer_count
FROM customers c
WHERE c.signup_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
  AND c.signup_date < DATE_TRUNC('month', CURRENT_DATE)
GROUP BY c.region

Validating Generated Queries

Models sometimes generate syntactically valid but semantically incorrect SQL. Validation catches these errors before they cause problems.

def validate_sql(sql: str, conn: sqlite3.Connection) -> dict:
    """Validate SQL query syntax and execution."""
    
    result = {
        'valid': False,
        'error': None,
        'row_count': 0,
        'sample_results': None
    }
    
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        
        # Verify execution succeeds
        columns = [desc[0] for desc in cursor.description]
        rows = cursor.fetchmany(5)  # Fetch sample results
        
        result['valid'] = True
        result['columns'] = columns
        result['sample_results'] = rows
        
    except Exception as e:
        result['error'] = str(e)
    
    return result

# Test with SQLite
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE test (a INTEGER, b TEXT)")
conn.execute("INSERT INTO test VALUES (1, 'x'), (2, 'y')")

# Validate a generated query
validation = validate_sql("SELECT * FROM test", conn)
print(f"Valid: {validation['valid']}")
print(f"Columns: {validation.get('columns')}")

Handling Query Failures

Common failure modes include incorrect table references, missing JOIN conditions, wrong aggregation logic, and misunderstood column meanings. Debugging involves comparing generated SQL against expected results and providing corrective feedback.

def debug_sql(question: str, sql: str, error: str, schema: str) -> str:
    """Debug failed SQL generation."""
    
    prompt = f"""Question: {question}
    Generated SQL: {sql}
    Error: {error}
    Schema: {schema}
    
    What is wrong with this query? Provide corrected SQL with explanation."""
    
    response = ollama.chat(
        model='llama3.2',
        messages=[{'role': 'user', 'content': prompt}]
    )
    
    return response['message']['content']

Iterative debugging with specific error feedback produces progressively better queries.

EXERCISE

Create a small SQLite database with 3-4 related tables. Generate queries for 10 different questions. Identify which queries fail and why. Build a prompt template that improves generation accuracy.

← Chapter 3
Automated Data Profiling
Chapter 5 →
Natural Language Queries