How to build a cost tracking dashboard for AI usage
AI API with billing data, dashboard framework (Grafana/Streamlit)
What this does
A cost tracking dashboard provides visibility into AI spending across models, users, and projects. It collects billing data via API, stores it in a time-series database, and renders interactive charts with Streamlit or Grafana, enabling engineering and finance teams to monitor consumption and set budget alerts.
Steps
Set up a Python script to fetch cost data from the AI provider's billing endpoint at regular intervals. Use the requests library with an API key stored in environment variables loaded via python-dotenv. Build a structured JSON payload containing fields for model, user_id, project_id, tokens_used, timestamp, and cost_usd. Install InfluxDB as the time-series store and create a measurement called ai_costs with tags for model, user_id, and project_id, and fields for tokens and cost. Write a function that reads each API response and writes a data point with the appropriate timestamp. Schedule ingestion with APScheduler running every 15 minutes.
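One way to picture the write step is to convert each cost record into an InfluxDB line-protocol string for the ai_costs measurement. The sketch below uses only the standard library. The record shape and the helper names (to_line_protocol, _escape_tag) are assumptions for illustration, not the provider's actual response format, and naive timestamps are assumed to be UTC.

```python
from datetime import datetime, timezone


def _escape_tag(value: str) -> str:
    """Escape commas, equals signs, and spaces, which are special in tag values."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(record: dict) -> str:
    """Convert one cost record (hypothetical shape) into a line-protocol string."""
    # Tags: the dimensions the dashboard groups by.
    tags = ",".join(
        f"{key}={_escape_tag(str(record[key]))}"
        for key in ("model", "project_id", "user_id")
    )
    # Fields: integer values carry an "i" suffix, floats are written as-is.
    fields = (
        f"cost_usd={float(record['cost_usd'])},"
        f"tokens_used={int(record['tokens_used'])}i"
    )
    # Parse the ISO 8601 timestamp and express it as whole seconds since the epoch.
    ts = datetime.fromisoformat(record["timestamp"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    return f"ai_costs,{tags} {fields} {int(ts.timestamp())}"


example = {
    "model": "model-a",
    "user_id": "u1",
    "project_id": "p1",
    "tokens_used": 1200,
    "cost_usd": 0.036,
    "timestamp": "2024-01-01T00:00:00+00:00",
}
print(to_line_protocol(example))
```

Because the timestamp is emitted in seconds, the write request would need to declare second precision. A scheduler job could call this function once per record in each API response.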
Create a dashboard.py file importing streamlit, pandas, and the InfluxDB client. Write a query function that pulls aggregated cost data grouped by model, user, or project for a configurable date range. Use st.selectbox to let the viewer switch between grouping dimensions. Render results with st.bar_chart for hourly and daily trends and st.metric for total spend. Store dashboard configuration in a TOML file to avoid hardcoding thresholds.
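The grouping logic behind the query function can be sketched in plain Python before wiring it to the database. The function name aggregate_costs, the record shape, and the ALLOWED_DIMENSIONS tuple are hypothetical. In the real dashboard, this aggregation would more likely happen in the InfluxDB query or in pandas.

```python
from collections import defaultdict
from datetime import datetime

# The grouping dimensions offered to the viewer.
ALLOWED_DIMENSIONS = ("model", "user_id", "project_id")


def aggregate_costs(records, group_by, start, end):
    """Sum cost_usd per value of `group_by` for records with start <= timestamp < end."""
    if group_by not in ALLOWED_DIMENSIONS:
        raise ValueError(f"Unsupported grouping dimension: {group_by}")
    totals = defaultdict(float)
    for rec in records:
        ts = datetime.fromisoformat(rec["timestamp"])
        # start, end, and ts must all be timezone-aware (or all naive) to compare.
        if start <= ts < end:
            totals[rec[group_by]] += rec["cost_usd"]
    return dict(totals)
```

The value returned by st.selectbox can be passed as group_by. The resulting totals can then be turned into a DataFrame for st.bar_chart, and their sum can be shown with st.metric.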
Implement a notification layer that triggers when daily spend exceeds a configured limit. Read the threshold from the config file and compare it to the latest daily cost aggregation. When a breach occurs, log a warning and send a notification, either an email via SMTP or a message to a Slack channel via webhook. Run this check inside the same scheduler job as the data ingestion step so that each ingestion cycle is followed by a threshold check.
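The threshold check can be sketched with the standard library. The function names, the logger name, and the message wording are assumptions. The Slack payload uses the simple {"text": ...} JSON body that incoming webhooks accept, and the webhook URL is expected to come from configuration.

```python
import json
import logging
import urllib.request

logger = logging.getLogger("cost_alerts")  # hypothetical logger name


def check_daily_spend(daily_total_usd: float, daily_limit_usd: float):
    """Return an alert message if spend exceeds the limit, otherwise None."""
    if daily_total_usd <= daily_limit_usd:
        return None
    message = (
        f"Alert triggered: daily spend ${daily_total_usd:.2f} "
        f"exceeds limit ${daily_limit_usd:.2f}"
    )
    logger.warning(message)
    return message


def build_slack_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the alert as JSON."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To deliver the alert, the scheduler job would pass the built request to urllib.request.urlopen with a timeout. Keeping the check separate from the delivery makes the threshold logic easy to test without network access.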
Confirm the local starting state. Print the active binary, package version, model name, or configuration path before changing the workflow.
Run the smallest complete path. Execute the minimum command or script that proves the guide works end to end on the local machine.
Compare against expected output. Check the final line, status code, generated artifact, or model response against the verification section before expanding the setup.
Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.
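A small helper can capture the starting state and the run evidence in one place. This is a minimal sketch: the function name collect_run_evidence and the package list are assumptions, and influxdb-client is named only as one possible client distribution.

```python
import json
import platform
import sys
from importlib import metadata


def collect_run_evidence(packages, command):
    """Gather the interpreter path, Python version, and installed package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "command": command,
        "python_executable": sys.executable,
        "python_version": platform.python_version(),
        "package_versions": versions,
    }


if __name__ == "__main__":
    evidence = collect_run_evidence(
        ["streamlit", "pandas", "influxdb-client", "requests"],  # assumed package names
        "python ingest_costs.py",
    )
    print(json.dumps(evidence, indent=2))
```

Saving the printed JSON next to the observed output gives a record that can be compared against later runs.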
Verification
Run the ingestion script and launch the dashboard:
python ingest_costs.py
streamlit run dashboard.py
Navigate to http://localhost:8501. Expected output: a total cost metric card at the top of the page; a bar chart showing cost broken down by the selected grouping dimension; a date range selector that updates all charts interactively; and a console log entry showing "Alert triggered" when simulated spend exceeds the threshold.
Common failures
- Missing API credentials: The billing endpoint returns a 401 Unauthorized response if the API key is absent or expired. Validate environment variables before the first request using os.getenv() and raise an explicit error when a value is missing.
- InfluxDB write failures: A data point with the same measurement, tag set, and timestamp as an existing point silently overwrites its field values. Write timestamps at a precision fine enough to separate records, and pair each timestamp with a unique key when two records can share one.
- Dashboard crashes on empty data: Passing None or an empty DataFrame to chart functions can raise an error or render a blank chart. Guard results with a conditional check and display an st.info message when no data is present.
- Timezone mismatches: InfluxDB stores timestamps in UTC. If the ingestion script writes local times without converting them, reports are off by the UTC offset. Normalize all timestamps to UTC before writing.
- Rate limiting on billing API: Repeated requests without backoff trigger 429 Too Many Requests. Implement exponential backoff with the tenacity library.
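Three of the guards above can be sketched with the standard library. The helper names and the RateLimited exception are hypothetical. The retry loop is a hand-rolled stand-in that shows the idea of exponential backoff; it is not the tenacity API.

```python
import os
import random
import time
from datetime import datetime, timezone


def require_env(name: str) -> str:
    """Return the environment variable's value, or fail with an explicit error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def to_utc(ts: datetime) -> datetime:
    """Convert an aware datetime to UTC; reject naive ones rather than guess."""
    if ts.tzinfo is None:
        raise ValueError("Naive datetime: attach a timezone before writing")
    return ts.astimezone(timezone.utc)


class RateLimited(Exception):
    """Hypothetical error raised by the caller's fetch function on a 429 response."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func, retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delays of roughly 1s, 2s, 4s, ... plus a little random jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

The sleep parameter is injectable so the retry logic can be exercised in tests without waiting.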
Related guides
- Implement usage-based billing for AI products — connects directly by providing the cost data that feeds billing invoices.
- Build a real-time AI monitoring dashboard — extends cost tracking to live monitoring with Prometheus metrics.