How to build a cost tracking dashboard for AI usage

Intermediate · 30 min · By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES

AI API with billing data, dashboard framework (Grafana/Streamlit)

What this does

A cost tracking dashboard provides visibility into AI spending across models, users, and projects. It collects billing data via API, stores it in a time-series database, and renders interactive charts with Streamlit or Grafana, enabling engineering and finance teams to monitor consumption and set budget alerts.

Steps

Set up a Python script, ingest_costs.py, that fetches cost data from the AI provider's billing endpoint at regular intervals. Use the requests library with an API key stored in an environment variable and loaded via python-dotenv. Build a structured JSON record for each usage entry containing the fields model, user_id, project_id, tokens_used, timestamp, and cost_usd. Install InfluxDB as the time-series store and write to a measurement called ai_costs, with tags for model, user_id, and project_id and fields for tokens and cost; InfluxDB creates the measurement on the first write. Write a function that reads each API response and writes one data point per record with that record's timestamp. Schedule ingestion with APScheduler to run every 15 minutes.

Create a dashboard.py file importing streamlit, pandas, and the InfluxDB client. Write a query function that pulls aggregated cost data grouped by model, user, or project for a configurable date range. Use st.selectbox to let the viewer switch between grouping dimensions. Render results with st.bar_chart for hourly and daily trends and st.metric for total spend. Store dashboard configuration in a TOML file to avoid hardcoding thresholds.
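
The grouping logic behind the selector can be sketched independently of Streamlit and InfluxDB. The helper below sums cost_usd per value of a chosen dimension over rows that have already been queried; the function name total_by, the allowlist, and the sample rows are hypothetical. In dashboard.py the dimension would come from st.selectbox, and the result would typically be converted to a pandas DataFrame before being passed to st.bar_chart, with the grand total shown through st.metric.

```python
from collections import defaultdict

# Only these dimensions may be selected; anything else is rejected early.
ALLOWED_DIMENSIONS = ("model", "user_id", "project_id")


def total_by(rows: list[dict], dimension: str) -> dict[str, float]:
    """Sum cost_usd per distinct value of the chosen dimension, largest first."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"Unsupported grouping dimension: {dimension!r}")
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["cost_usd"]
    # Sort descending so the most expensive group appears first in the chart.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Hypothetical rows, as they might look after querying the ai_costs measurement.
rows = [
    {"model": "a", "user_id": "u1", "project_id": "p1", "cost_usd": 0.5},
    {"model": "a", "user_id": "u2", "project_id": "p1", "cost_usd": 0.25},
    {"model": "b", "user_id": "u1", "project_id": "p2", "cost_usd": 1.0},
]
by_model = total_by(rows, "model")
total_spend = sum(by_model.values())
```

Validating the dimension against an allowlist also matters if the same value is later interpolated into a database query.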

Implement a notification layer that triggers when daily spend exceeds a configured limit. Pull the threshold from the config file and compare it to the latest cost aggregation. When a breach occurs, log a warning and send an email via SMTP or a webhook to a Slack channel. This check runs inside the same scheduler loop as the data ingestion step.

  • Record the local run evidence. Save the exact command, runtime or package version, model name if applicable, and observed output so the result can be reproduced later.

  • Confirm the local starting state. Print the active binary, package version, model name, or configuration path before changing the workflow.

  • Run the smallest complete path. Execute the minimum command or script that proves the guide works end to end on the local machine.

  • Compare against expected output. Check the final line, status code, generated artifact, or model response against the verification section before expanding the setup.

Verification

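Run the ingestion script and launch the dashboard. If the ingestion script keeps its scheduler running in the foreground, start the dashboard from a second terminal: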

python ingest_costs.py
streamlit run dashboard.py

Navigate to http://localhost:8501. Expected output: a total cost metric card at the top of the page; a bar chart showing cost broken down by the selected grouping dimension; a date range selector that updates all charts interactively; and a console log entry showing "Alert triggered" when simulated spend exceeds the threshold.

Common failures

  • Missing API credentials: The billing endpoint returns a 401 Unauthorized response if the API key is absent or expired. Validate environment variables before the first request: read each one with os.getenv() and raise a clear error if it is missing.
  • InfluxDB write failures: A point that shares its measurement, tag set, and timestamp with an existing point silently overwrites it. Write timestamps at a precision fine enough to keep records distinct, or add a distinguishing tag, so that separate usage records never collide.
  • Dashboard crashes on empty data: Passing None or an empty DataFrame to chart functions can raise an error. Guard results with a conditional check and display an st.info message when no data is present.
  • Timezone mismatches: InfluxDB stores timestamps in UTC by default. A mismatch between the ingestion script's local timezone and the database causes off-by-hours errors in reports. Normalize all timestamps to UTC before writing.
  • Rate limiting on billing API: Repeated requests without backoff trigger 429 Too Many Requests. Implement exponential backoff with the tenacity library.
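
Three of the safeguards above can be sketched with the standard library alone. The helper names and the RateLimited exception are hypothetical. The retry loop is a hand-rolled stand-in that shows the doubling delay schedule; it is not the tenacity library, which the guide recommends for the same job in real code.

```python
import os
import time
from datetime import datetime, timezone


def require_env(name: str) -> str:
    """Return an environment variable or fail early with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def to_utc(ts: datetime) -> datetime:
    """Convert an offset-aware timestamp to UTC; reject naive ones."""
    if ts.tzinfo is None:
        raise ValueError("Naive timestamp: attach a timezone before writing")
    return ts.astimezone(timezone.utc)


class RateLimited(Exception):
    """Raised by the caller's fetch function when the API answers HTTP 429."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call func, doubling the wait after each RateLimited error."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The sleep parameter exists so the delay schedule can be checked in a test without actually waiting.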
