How to implement token-level logging for LLM prompt auditing and compliance
Prerequisites: an LLM API with token-level access and a compliance logging target.
What this does
This guide captures every input and output token exchanged with an LLM, along with metadata for each interaction, into an append-only audit log. The log records the exact token sequences sent and received, model version, user identifier, timestamp, and session context. This enables post-hoc compliance audits, data-subject access requests, and forensic analysis of model behavior for regulated environments.
Steps
Define the audit record schema. Each record must contain:
    import hashlib
    import json
    import uuid
    from datetime import datetime, timezone

    audit_record = {
        "event_id": str(uuid.uuid4()),
        # Timezone-aware UTC timestamp; the first 10 characters are the date.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,
        "user_id": user_id,
        "model": model_name,
        "model_version": model_version,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        # Hashes cover the JSON-serialized token lists.
        "input_hash": hashlib.sha256(json.dumps(input_tokens).encode()).hexdigest(),
        "output_hash": hashlib.sha256(json.dumps(output_tokens).encode()).hexdigest(),
        "request_duration_ms": request_duration_ms,
        "stop_reason": stop_reason,
    }

Capture token sequences. If the API returns token-level data, store the full sequence. If only token counts are available, log the text with a hash for integrity:
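    # Store the text itself alongside its hash so the hash can be rechecked later.
    audit_record["input_text"] = prompt_text
    audit_record["output_text"] = response_text
    audit_record["input_text_hash"] = hashlib.sha256(prompt_text.encode()).hexdigest()
    audit_record["output_text_hash"] = hashlib.sha256(response_text.encode()).hexdigest()

Write records to an append-only S3 prefix with Object Lock. Configure the bucket: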
    aws s3api put-object-lock-configuration \
      --bucket audit-logs \
      --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"GOVERNANCE","Days":2555}}}'

Implement an atomic write function:
    import json
    from datetime import datetime, timedelta, timezone

    import aioboto3

    async def write_audit_record(record):
        # One object per event, partitioned by the date part of the timestamp.
        key = f"audit/{record['timestamp'][:10]}/{record['event_id']}.json"
        session = aioboto3.Session()
        async with session.client("s3") as s3:
            await s3.put_object(
                Bucket="audit-logs",
                Key=key,
                Body=json.dumps(record),
                ContentType="application/json",
                # Uploads that set Object Lock retention need Content-MD5 or a checksum header.
                ChecksumAlgorithm="SHA256",
                ObjectLockMode="GOVERNANCE",
                ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=2555),
            )

Add a synchronous fallback for deployment environments without asyncio support. Use boto3 with ThreadPoolExecutor for non-blocking behavior.

Verify audit trail integrity. Write a validation script that recomputes hashes:
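    import hashlib
    import json

    def sha256_json(value):
        return hashlib.sha256(json.dumps(value).encode()).hexdigest()

    # fetch_audit_records is assumed to yield parsed audit records for the date range.
    for record in fetch_audit_records(start_date, end_date):
        for side in ("input", "output"):
            # Raise instead of assert so the check still runs under python -O.
            if sha256_json(record[f"{side}_tokens"]) != record[f"{side}_hash"]:
                raise ValueError(f"Integrity violation at {record['event_id']} ({side}_hash)")

Run the integrity check as a cron job: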
0 6 * * * python validate_audit.py --days 1 >> /var/log/audit-validation.log 2>&1
Verification
aws s3 ls s3://audit-logs/audit/$(date -u +%Y-%m-%d)/ --recursive | wc -l
Expected output: number of audit records written today, greater than 0.
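Counting objects shows that records were written, not that they are intact. As a complementary spot check, the sketch below recomputes the hashes of a single downloaded record and reports which ones fail. The function name failed_checks and the input_text/output_text key names are assumptions made here; the token and hash field names follow the schema in the steps.

```python
import hashlib
import json


def _sha256(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()


def failed_checks(record: dict) -> list[str]:
    """Return the names of hash fields that do not match the stored content."""
    failures = []
    # Token-level hashes, recomputed over the JSON-serialized token lists.
    for side in ("input", "output"):
        tokens = record.get(f"{side}_tokens")
        if tokens is not None and _sha256(json.dumps(tokens)) != record.get(f"{side}_hash"):
            failures.append(f"{side}_hash")
    # Text-level hashes can only be checked if the text itself was stored
    # (assumed here under the keys "input_text" and "output_text").
    for side in ("input", "output"):
        text = record.get(f"{side}_text")
        if text is not None and _sha256(text) != record.get(f"{side}_text_hash"):
            failures.append(f"{side}_text_hash")
    return failures
```

A record downloaded from the bucket can be parsed with json.load and passed to failed_checks; an empty list means every hash present in the record matches its content.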
Common failures
- Duplicate audit records on retry: generate a deterministic event_id by hashing the request body, or use idempotency keys from the calling service.
- S3 Object Lock fails: Object Lock must be enabled on the bucket, which also requires versioning. Historically this was only possible at bucket creation; AWS has since added support for enabling it on existing versioned buckets, but creating a dedicated bucket remains the simplest route: aws s3api create-bucket --bucket audit-logs --object-lock-enabled-for-bucket (outside us-east-1, also pass --create-bucket-configuration LocationConstraint=<region>).
- Token-level data not available from API: some LLM providers do not expose per-token output. In this case, store the text representation with a cryptographic hash as a substitute.
- High write costs at scale: batch audit records into larger objects (for example, one per minute) to reduce S3 PUT costs. Records held in a batch buffer are not yet protected by Object Lock, so flush each batch promptly and on shutdown.
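
Two of the mitigations above, deterministic event IDs and per-minute batching, can be sketched with the standard library alone. This is an illustrative sketch under assumptions: the helper names, the choice to hash the session ID together with a canonical JSON form of the request body, and the newline-delimited JSON batch layout are not from this guide. Timestamps are assumed to be ISO 8601 strings, as in the schema.

```python
import hashlib
import json
from collections import defaultdict


def deterministic_event_id(session_id: str, request_body: dict) -> str:
    """Same session and request body give the same ID, so a retry maps to the same key."""
    # Canonical form: sorted keys and fixed separators, so key order does not matter.
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{session_id}\n{canonical}".encode()).hexdigest()


def batch_by_minute(records: list[dict]) -> dict[str, str]:
    """Group records by minute; return {object_key: newline-delimited JSON body}."""
    groups = defaultdict(list)
    for record in records:
        minute = record["timestamp"][:16]  # "YYYY-MM-DDTHH:MM"
        groups[minute].append(record)
    batches = {}
    for minute, group in groups.items():
        day, clock = minute.split("T")
        key = f"audit/{day}/batch-{clock.replace(':', '')}.jsonl"
        batches[key] = "\n".join(json.dumps(r) for r in group)
    return batches
```

A legitimately repeated identical request in the same session would produce the same ID, so including a client-supplied idempotency key or request counter in the hashed input avoids false deduplication. On a versioned bucket, writing the same key again creates a new object version rather than failing, so duplicates still need to be collapsed at read time.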