KEY INSIGHT
Usage metering tracks resource consumption at the API call level, enabling accurate billing and preventing abuse before it impacts margins.
AI API costs are measured in tokens. Every request to an AI model consumes tokens, which translates directly to costs from model providers plus your margin. Without accurate metering, billing becomes guesswork and margin erosion goes unnoticed.
The metering system records each API call with sufficient granularity to reconstruct billing reports and detect anomalies.
```python
from sqlalchemy import Column, String, Integer, DateTime, Text
from datetime import datetime
class UsageRecord(Base):
__tablename__ = "usage_records"
id = Column(String(36), primary_key=True)
tenant_id = Column(String(36), nullable=False, index=True)
workspace_id = Column(String(36), nullable=False, index=True)
api_key_id = Column(String(36), nullable=False, index=True)
# Request details
model_name = Column(String(100), nullable=False)
request_tokens = Column(Integer, nullable=False, default=0)
response_tokens = Column(Integer, nullable=False, default=0)
total_tokens = Column(Integer, nullable=False)
# Cost tracking (in kobo for Naira)
cost_kobo = Column(Integer, nullable=False)
# Timestamps
created_at = Column(DateTime, default=datetime.utcnow, index=True)
# Request hash for deduplication
request_hash = Column(String(64), nullable=False)
class UsageMeter:
def __init__(self, db: Session):
self.db = db
def record_usage(
self,
tenant_id: str,
workspace_id: str,
api_key_id: str,
model_name: str,
request_tokens: int,
response_tokens: int,
cost_kobo: int,
request_hash: str
) -> UsageRecord:
"""Record usage for a single API call."""
# Check for duplicate (idempotency)
existing = self.db.query(UsageRecord).filter_by(
request_hash=request_hash
).first()
if existing:
return existing
record = UsageRecord(
id=str(uuid.uuid4()),
tenant_id=tenant_id,
workspace_id=workspace_id,
api_key_id=api_key_id,
model_name=model_name,
request_tokens=request_tokens,
response_tokens=response_tokens,
total_tokens=request_tokens + response_tokens,
cost_kobo=cost_kobo,
request_hash=request_hash
)
self.db.add(record)
self.db.commit()
return record
```
Deduplication matters. Network failures cause clients to retry requests. Without deduplication via request hashing, retry attempts get billed multiple times. The `request_hash` should be a hash of the unique request identifier sent by the client, or a hash of (timestamp + model + truncated prompt) for client-generated requests.
```python
import hashlib
import json
def generate_request_hash(
api_key_id: str,
model_name: str,
prompt: str,
timestamp: datetime
) -> str:
"""Generate deterministic hash for deduplication."""
# Use truncated prompt to limit hash input size
payload = {
"key": api_key_id,
"model": model_name,
"prompt": prompt[:500], # First 500 chars
"ts": timestamp.isoformat()
}
return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
```