How to use Prometheus remote_write for long-term AI metrics archival
Prerequisites: a running Prometheus instance and a reachable remote storage endpoint (Thanos Receive or VictoriaMetrics).
What this does
This guide configures Prometheus remote_write to ship AI service metrics to a long-term storage backend for historical analysis, cost auditing, and capacity planning. Local Prometheus retains high-resolution data for 30 days (set with --storage.tsdb.retention.time=30d; the default is 15 days), while remote storage retains downsampled data for years. This enables year-over-year comparisons of token usage trends, model performance regressions, and infrastructure utilization patterns without bloating the local Prometheus TSDB.
Steps
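1. Identify the remote storage endpoint URL.

   For Thanos Receive:

       http://thanos-receive:19291/api/v1/receive

   For VictoriaMetrics:

       http://victoriametrics:8428/api/v1/write

2. Add the remote_write block to prometheus.yml:

       remote_write:
         - url: "http://victoriametrics:8428/api/v1/write"
           queue_config:
             capacity: 10000
             max_shards: 20
             min_shards: 1
             max_samples_per_send: 5000
             batch_send_deadline: 5s
             min_backoff: 30ms
             max_backoff: 5s
           write_relabel_configs:
             # One keep rule with an alternation: relabel rules run in sequence,
             # so two separate keep rules would drop every series.
             - source_labels: [__name__]
               regex: "(ai_|vllm:).*"
               action: keep

3. Configure write_relabel_configs to filter which metrics are sent. The example above keeps only metrics whose names start with ai_ or vllm:, which reduces remote storage costs by excluding everything else. Both prefixes belong in a single keep rule. Relabel rules are applied one after another and each regex must match the whole metric name, so a keep rule for ai_.* followed by a keep rule for vllm:.* would leave nothing to send.

4. Restart Prometheus and confirm the remote write configuration is loaded:

       sudo systemctl restart prometheus
       curl -s http://localhost:9090/api/v1/status/config | jq -r '.data.yaml' | grep -A 1 'remote_write:'

   Expected output: the start of the remote_write block, showing the configured URL. The config endpoint returns the loaded configuration as a single YAML string under .data.yaml, so it has to be searched as text rather than indexed with jq.

5. Monitor remote write health. Remote write endpoints are not scrape targets and do not appear under Status > Targets, so use the metrics Prometheus exposes about its own remote write queue:

       curl -s 'http://localhost:9090/api/v1/query?query=prometheus_remote_storage_succeeded_samples_total'

   Expected output: a non-zero value confirming samples are being written. Recent Prometheus releases replaced this metric with prometheus_remote_storage_samples_total, so query that name if the result is empty, and check that prometheus_remote_storage_samples_failed_total is not increasing.

6. Configure retention and deduplication on the remote storage. In VictoriaMetrics, add these flags:

       -dedup.minScrapeInterval=30s -retentionPeriod=24

   A -retentionPeriod value without a unit suffix is in months, so 24 keeps two years of data. -dedup.minScrapeInterval=30s keeps one sample per series in each 30-second window, which thins data scraped more often than that. This is deduplication rather than multi-level downsampling, which VictoriaMetrics documents as a separate feature.

7. Verify historical data is queryable through the remote storage's query API. For VictoriaMetrics:

       curl "http://victoriametrics:8428/api/v1/query?query=ai_token_input_total&time=$(date -d '30 days ago' +%s)"

   Expected output: metric data from 30 days ago, if remote write has been running that long. The date -d form requires GNU date.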
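The timestamp in the historical query can also be computed with plain shell arithmetic, which avoids depending on GNU date. The following is a minimal sketch; epoch_days_ago is a hypothetical helper name, not part of any tool mentioned in this guide, and its optional second argument exists only to make the result reproducible.

```shell
# Print the Unix timestamp for DAYS days before NOW.
# NOW is optional and defaults to the current time.
# epoch_days_ago is a hypothetical helper, not part of Prometheus or VictoriaMetrics.
epoch_days_ago() {
  days=$1
  now=${2:-$(date +%s)}
  echo $(( now - days * 86400 ))
}

# Example use against the VictoriaMetrics endpoint from this guide:
#   curl -s "http://victoriametrics:8428/api/v1/query?query=ai_token_input_total&time=$(epoch_days_ago 30)"
```

Passing an explicit NOW value, as in epoch_days_ago 30 1700000000, gives the same result on every run, which is useful when comparing runs.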
Verification
curl -s http://localhost:9090/api/v1/status/tsdb | jq '.data.headStats.numSeries'
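Expected output: the number of series in the local TSDB head. This number does not change when write_relabel_configs is added, because those rules only filter what is sent to remote storage and leave local ingestion untouched. To confirm the filter works, list the metric names held by the remote storage (for VictoriaMetrics, through the Prometheus-compatible /api/v1/label/__name__/values endpoint) and check that only ai_ and vllm: names appear.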
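One way to check what actually reached remote storage is to compare the metric names expected to be shipped with the names the remote side reports. The sketch below assumes both lists have already been saved to files with one name per line; missing_in_remote and the file names expected.txt and remote.txt are hypothetical, and the commented curl lines assume jq is installed and that both servers answer the standard label-values API.

```shell
# Print metric names that appear in the first file but not in the second.
# Each file holds one metric name per line; order and duplicates do not matter.
# missing_in_remote is a hypothetical helper for illustration.
missing_in_remote() {
  expected_sorted=$(mktemp)
  remote_sorted=$(mktemp)
  sort -u "$1" > "$expected_sorted"
  sort -u "$2" > "$remote_sorted"
  # comm -23 prints lines unique to the first sorted file.
  comm -23 "$expected_sorted" "$remote_sorted"
  rm -f "$expected_sorted" "$remote_sorted"
}

# Possible way to build the two input files (not run here):
#   curl -s http://localhost:9090/api/v1/label/__name__/values \
#     | jq -r '.data[]' | grep -E -x '(ai_|vllm:).*' > expected.txt
#   curl -s http://victoriametrics:8428/api/v1/label/__name__/values \
#     | jq -r '.data[]' > remote.txt
#   missing_in_remote expected.txt remote.txt
```

Empty output means every expected metric name is present remotely. Any name printed is one the filter or the transport has dropped.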
Common failures
- Remote storage is unreachable: check network connectivity from the Prometheus host to the remote endpoint. Use curl -v http://victoriametrics:8428/api/v1/write to confirm the endpoint responds.
- Remote write falls behind the WAL: the remote endpoint is accepting data too slowly, so pending samples pile up. Increase max_shards and capacity, or reduce the number of metrics being sent via write_relabel_configs.
- Data gaps in long-term storage: remote write reads samples from the WAL, so samples that had not been sent when Prometheus restarted, or that aged out of the WAL during a long remote outage, can be lost. Separately, leave send_exemplars at false (the default) if the remote endpoint does not support exemplars.
- Filter excludes important metrics: test the relabel config before relying on it. promtool check config prometheus.yml catches syntax errors but does not show which series a rule keeps, so also match the regex against the metric names Prometheus currently exposes and verify the intended metrics are kept.
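A keep rule can be approximated offline by matching its regex against a list of metric names. The sketch below assumes the names arrive on standard input, one per line; relabel_keep is a hypothetical helper, and grep's extended regex dialect is only a stand-in for the RE2 syntax Prometheus uses, which is close enough for simple prefix patterns like the ones in this guide. The sample metric names are illustrative.

```shell
# Keep only the lines (metric names) that match the regex in full.
# grep -x anchors the pattern at both ends, mirroring how Prometheus
# anchors relabel regexes. relabel_keep is a hypothetical helper.
relabel_keep() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E -x -e "$1" || true
}

# Illustrative metric names, one per line.
names='ai_token_input_total
vllm:num_requests_running
up
my_ai_metric'

# A single rule with an alternation keeps both prefixes.
printf '%s\n' "$names" | relabel_keep '(ai_|vllm:).*'

# Chaining two keep rules mimics two sequential relabel rules:
# a name must pass both, so nothing survives.
printf '%s\n' "$names" | relabel_keep 'ai_.*' | relabel_keep 'vllm:.*'
```

The first pipeline prints ai_token_input_total and vllm:num_requests_running, and the second prints nothing. Note that my_ai_metric is rejected in both cases because the match is anchored at the start of the name.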