05. Report Generation
Chapter 5 of 18 · 15 min
Report generation combines data retrieval, AI synthesis, and formatting. The pattern has four steps: extract the data, pass it to the model as context, generate the narrative sections, and format the output.
Start with data extraction from your sources:
import csv
from datetime import datetime, timedelta


def extract_sales_data(days=7):
    """Extract sales data for the reporting period."""
    # Simulated data extraction from CSV
    start_date = datetime.now() - timedelta(days=days)
    sales_data = []
    with open('sales_log.csv', 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            row_date = datetime.strptime(row['date'], '%Y-%m-%d')
            # Keep only rows that fall inside the reporting window
            if row_date >= start_date:
                sales_data.append({
                    'date': row['date'],
                    'product': row['product'],
                    'amount': float(row['amount']),
                    'region': row['region'],
                    'customer': row['customer']
                })
    return sales_data


def summarize_data(sales_data):
    """Create summary statistics for report context."""
    if not sales_data:
        # Return the same keys as the non-empty case so callers
        # can read 'by_region' and 'by_product' without a KeyError
        return {'total': 0, 'count': 0, 'average': 0,
                'by_region': {}, 'by_product': {}}
    total = sum(s['amount'] for s in sales_data)
    return {
        'total': total,
        'count': len(sales_data),
        'average': total / len(sales_data),
        'by_region': group_by(sales_data, 'region'),
        'by_product': group_by(sales_data, 'product')
    }


def group_by(data, key):
    """Group data by a field, tracking total amount and row count."""
    groups = {}
    for item in data:
        value = item[key]
        if value not in groups:
            groups[value] = {'total': 0, 'count': 0}
        groups[value]['total'] += item['amount']
        groups[value]['count'] += 1
    return groups
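To try the extraction step without real data, it can help to generate a small sample file first. The sketch below writes a CSV with the same five columns the extraction code reads (date, product, amount, region, customer). The helper name write_sample_sales_log, the product and region values, and the amounts are all made up for illustration.

```python
import csv
import os
import tempfile
from datetime import datetime, timedelta

# Column names match the fields read by the extraction code
FIELDS = ['date', 'product', 'amount', 'region', 'customer']


def write_sample_sales_log(path, days=10):
    """Write one made-up sale per day, newest first, and return the row count."""
    today = datetime.now()
    products = ['Widget', 'Gadget']        # hypothetical product names
    regions = ['North', 'South', 'West']   # hypothetical regions
    rows = []
    for i in range(days):
        rows.append({
            'date': (today - timedelta(days=i)).strftime('%Y-%m-%d'),
            'product': products[i % len(products)],
            'amount': f"{100 + 25 * i:.2f}",
            'region': regions[i % len(regions)],
            'customer': f"customer-{i + 1}",
        })
    # newline='' is what the csv module documentation recommends for writers
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Write into a temporary directory so no existing file is overwritten
sample_path = os.path.join(tempfile.mkdtemp(), 'sales_log.csv')
row_count = write_sample_sales_log(sample_path)
print(f"Wrote {row_count} rows to {sample_path}")
```

Because extract_sales_data opens 'sales_log.csv' relative to the current directory, either write the sample to that path or adjust the path in the function before running it.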
The model receives structured data plus natural language context about what to highlight:
REPORT_GENERATION_PROMPT = """Generate a business report based on the following data.
REPORTING PERIOD: {period}
KEY METRICS:
- Total Revenue: ${total:,.2f}
- Transaction Count: {count}
- Average Transaction: ${average:,.2f}
REGIONAL BREAKDOWN:
{regional_breakdown}
PRODUCT BREAKDOWN:
{product_breakdown}
Write a professional business report covering:
1. Executive summary (2-3 sentences)
2. Key highlights and trends
3. Regional performance notes
4. Notable patterns or anomalies
Keep the report concise, approximately 300 words. Use professional tone.
"""
# Assumes the Ollama Python client, which matches the chat() call
# and response shape used below
from ollama import chat


def generate_report(sales_data, period_description):
    """Generate a formatted business report."""
    summary = summarize_data(sales_data)
    # Turn each grouped breakdown into a bulleted list for the prompt
    regional_breakdown = "\n".join(
        f"- {region}: ${data['total']:,.2f} ({data['count']} transactions)"
        for region, data in summary['by_region'].items()
    )
    product_breakdown = "\n".join(
        f"- {product}: ${data['total']:,.2f} ({data['count']} transactions)"
        for product, data in summary['by_product'].items()
    )
    prompt = REPORT_GENERATION_PROMPT.format(
        period=period_description,
        total=summary['total'],
        count=summary['count'],
        average=summary['average'],
        regional_breakdown=regional_breakdown,
        product_breakdown=product_breakdown
    )
    response = chat(model='llama3.1:8b', messages=[
        {'role': 'user', 'content': prompt}
    ])
    return response['message']['content']
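Since the model call needs a running model server, one option is to check the prompt-building and response-handling logic with a stand-in function first. The sketch below is one way to do that, not part of the chapter's code: build_prompt, generate_text, stub_model, and the shortened TEMPLATE are hypothetical names introduced only for this example.

```python
def build_prompt(template, **fields):
    """Fill the template; str.format raises KeyError if a placeholder is missing."""
    return template.format(**fields)


def generate_text(prompt, model_fn):
    """Call any function that maps a prompt string to report text."""
    text = model_fn(prompt)
    # Guard against an empty response before it reaches the formatting step
    if not text or not text.strip():
        raise ValueError("model returned an empty report")
    return text.strip()


def stub_model(prompt):
    """Stand-in for the real model call; echoes the first prompt line."""
    first_line = prompt.splitlines()[0]
    return f"STUB REPORT\n{first_line}\n"


# Shortened, hypothetical template using the same format specs as the real prompt
TEMPLATE = "REPORTING PERIOD: {period}\nTotal Revenue: ${total:,.2f}"

prompt = build_prompt(TEMPLATE, period="last 7 days", total=1234.5)
report = generate_text(prompt, stub_model)
print(report)
```

With this shape, the real model can be swapped in by passing a function that wraps the chat call and returns the message content, while tests keep using the stub.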
The generated text can be delivered as plain text or markdown, or converted to HTML or PDF with a converter. A practical approach is to generate markdown and convert it to HTML or PDF as needed.
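As a minimal sketch of the markdown-first approach, the function below wraps the model's narrative in a markdown document and lists the key metrics directly from the summary dictionary, so the exact figures do not depend on the model repeating them correctly. The names render_markdown_report and save_report, the section headings, and the example values are assumptions made for illustration.

```python
def render_markdown_report(title, period, narrative, summary):
    """Combine exact metrics and model-written narrative into one markdown document."""
    lines = [
        f"# {title}",
        "",
        f"Period: {period}",
        "",
        "## Key Metrics",
        "",
        # Figures come straight from the computed summary, not from the model
        f"- Total revenue: ${summary['total']:,.2f}",
        f"- Transactions: {summary['count']}",
        f"- Average transaction: ${summary['average']:,.2f}",
        "",
        "## Narrative",
        "",
        narrative.strip(),
        "",
    ]
    return "\n".join(lines)


def save_report(markdown_text, path):
    """Write the markdown document to disk as UTF-8."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write(markdown_text)


# Made-up values, shaped like the dictionary returned by summarize_data
example_summary = {'total': 1500.0, 'count': 3, 'average': 500.0}
doc = render_markdown_report(
    "Weekly Sales Report", "last 7 days", "Sales were steady.\n", example_summary
)
print(doc)
```

Once saved, the file can be converted with an external tool; for example, pandoc accepts a command of the form `pandoc report.md -o report.html`.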
EXERCISE
Identify a report you currently generate manually. Extract the raw data and test automated report generation on it. Compare the quality of the generated report with your manual version.