What this does

This guide shows how to capture every input and output token exchanged with an LLM, along with metadata for each interaction, in an append-only audit log. Each record holds the exact token sequences sent and received, the model version, user identifier, timestamp, and session context. The log supports post-hoc compliance audits, data-subject access requests, and forensic analysis of model behavior in regulated environments.

Steps

Define the audit record schema. Each record must contain:

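import hashlib
import json
import uuid
from datetime import datetime, timezone

audit_record = {
    "event_id": str(uuid.uuid4()),
    # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12.
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "session_id": session_id,
    "user_id": user_id,
    "model": model_name,
    "model_version": model_version,
    "input_tokens": input_tokens,
    "output_tokens": output_tokens,
    # SHA-256 over the JSON encoding of each token list, for later integrity checks.
    "input_hash": hashlib.sha256(json.dumps(input_tokens).encode()).hexdigest(),
    "output_hash": hashlib.sha256(json.dumps(output_tokens).encode()).hexdigest(),
    "request_duration_ms": request_duration_ms,
    "stop_reason": stop_reason,
}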
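
The schema above can be wrapped in a small builder so that every call site produces the same fields. This is a minimal sketch rather than a definitive implementation: build_audit_record and token_hash are hypothetical names, and the sample values at the bottom are placeholders.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def token_hash(tokens):
    """SHA-256 over the JSON encoding of a token list (same encoding as the schema)."""
    return hashlib.sha256(json.dumps(tokens).encode()).hexdigest()


def build_audit_record(*, session_id, user_id, model_name, model_version,
                       input_tokens, output_tokens, request_duration_ms,
                       stop_reason):
    """Assemble one audit record with the fields listed in the schema."""
    return {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,
        "user_id": user_id,
        "model": model_name,
        "model_version": model_version,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_hash": token_hash(input_tokens),
        "output_hash": token_hash(output_tokens),
        "request_duration_ms": request_duration_ms,
        "stop_reason": stop_reason,
    }


# Placeholder values, for illustration only.
record = build_audit_record(
    session_id="sess-1",
    user_id="user-42",
    model_name="example-model",
    model_version="2024-01",
    input_tokens=[101, 2023, 102],
    output_tokens=[101, 7592, 102],
    request_duration_ms=312,
    stop_reason="end_turn",
)
```

Keyword-only arguments make it harder to swap input and output tokens by accident, which matters because the hashes are derived from them.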

Capture token sequences. If the API returns token-level data, store the full sequence. If only token counts are available, log the text with a hash for integrity:

audit_record["input_text_hash"] = hashlib.sha256(prompt_text.encode()).hexdigest()
audit_record["output_text_hash"] = hashlib.sha256(response_text.encode()).hexdigest()
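
One way to handle both cases in a single place is sketched below. attach_payload is a hypothetical helper, and the input_text and output_text field names are assumptions; the guide only names the hash fields.

```python
import hashlib
import json


def attach_payload(record, *, prompt_text, response_text,
                   input_tokens=None, output_tokens=None):
    """Store token sequences when available, otherwise the text plus its hash."""
    if input_tokens is not None and output_tokens is not None:
        # Provider exposes token-level data: keep the full sequences.
        record["input_tokens"] = input_tokens
        record["output_tokens"] = output_tokens
        record["input_hash"] = hashlib.sha256(
            json.dumps(input_tokens).encode()).hexdigest()
        record["output_hash"] = hashlib.sha256(
            json.dumps(output_tokens).encode()).hexdigest()
    else:
        # Only text is available: log it with a hash for integrity.
        record["input_text"] = prompt_text
        record["output_text"] = response_text
        record["input_text_hash"] = hashlib.sha256(
            prompt_text.encode()).hexdigest()
        record["output_text_hash"] = hashlib.sha256(
            response_text.encode()).hexdigest()
    return record
```

A validation script then needs to check whichever pair of hash fields is present in each record.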

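Write records to an S3 prefix protected by Object Lock, which keeps each object version write-once for the retention period. The command below sets a default retention of 2555 days (about seven years) in GOVERNANCE mode. Principals with the s3:BypassGovernanceRetention permission can still override GOVERNANCE retention, so use COMPLIANCE mode if nobody should be able to shorten it. Configure the bucket: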

aws s3api put-object-lock-configuration --bucket audit-logs --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"GOVERNANCE","Days":2555}}}'

Implement an atomic write function:

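import json
from datetime import datetime, timedelta, timezone

import aioboto3

async def write_audit_record(record):
    # One object per event, partitioned by the UTC date in the ISO timestamp.
    key = f"audit/{record['timestamp'][:10]}/{record['event_id']}.json"
    session = aioboto3.Session()
    async with session.client("s3") as s3:
        await s3.put_object(
            Bucket="audit-logs",
            Key=key,
            Body=json.dumps(record),
            ContentType="application/json",
            ObjectLockMode="GOVERNANCE",
            # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12.
            ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=2555),
        )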

Add a synchronous fallback for deployment environments without asyncio support. Use boto3 with ThreadPoolExecutor for non-blocking behavior.
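
A possible shape for that fallback is sketched here. The names write_audit_record_sync and submit_audit_record, the pool size, and passing the client in as an argument are all assumptions made for illustration; in a real deployment s3_client would be created with boto3.client("s3").

```python
import json
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 2555  # same retention as the bucket default
_executor = ThreadPoolExecutor(max_workers=4)  # pool size is an arbitrary choice


def write_audit_record_sync(s3_client, record, bucket="audit-logs"):
    """Blocking write; s3_client is expected to behave like a boto3 S3 client."""
    key = f"audit/{record['timestamp'][:10]}/{record['event_id']}.json"
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(record).encode("utf-8"),
        ContentType="application/json",
        ObjectLockMode="GOVERNANCE",
        ObjectLockRetainUntilDate=(
            datetime.now(timezone.utc) + timedelta(days=RETENTION_DAYS)
        ),
    )
    return key


def submit_audit_record(s3_client, record):
    """Queue the write on a worker thread and return a Future immediately."""
    return _executor.submit(write_audit_record_sync, s3_client, record)
```

The caller should keep the returned Future and either call result() or attach a done callback, since an exception raised inside the worker thread is otherwise silent and the audit record would be lost without notice.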

Verify audit trail integrity. Write a validation script that recomputes hashes:

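for record in fetch_audit_records(start_date, end_date):
    for field in ("input", "output"):  # verify both hashes, not only the input
        expected_hash = hashlib.sha256(json.dumps(record[f"{field}_tokens"]).encode()).hexdigest()
        # Raise explicitly: assert statements are skipped when Python runs with -O.
        if expected_hash != record[f"{field}_hash"]:
            raise ValueError(f"Integrity violation at {record['event_id']} ({field}_hash)")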

Run the integrity check as a cron job:

0 6 * * * python validate_audit.py --days 1 >> /var/log/audit-validation.log 2>&1
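
The cron entry assumes a validate_audit.py that accepts a --days argument. A minimal sketch of such a script follows; find_violations and main are hypothetical names, and fetch_audit_records is passed in as a parameter because the guide does not define how records are read back from S3.

```python
import argparse
import hashlib
import json
from datetime import datetime, timedelta, timezone


def find_violations(records):
    """Return (event_id, field) pairs whose stored hash does not match the tokens."""
    violations = []
    for record in records:
        for field in ("input", "output"):
            recomputed = hashlib.sha256(
                json.dumps(record[f"{field}_tokens"]).encode()
            ).hexdigest()
            if recomputed != record[f"{field}_hash"]:
                violations.append((record["event_id"], f"{field}_hash"))
    return violations


def main(argv, fetch_audit_records):
    """Check the last --days days and return a process exit code (0 means clean)."""
    parser = argparse.ArgumentParser(description="Recompute audit record hashes")
    parser.add_argument("--days", type=int, default=1)
    args = parser.parse_args(argv)

    end_date = datetime.now(timezone.utc).date()
    start_date = end_date - timedelta(days=args.days)
    violations = find_violations(fetch_audit_records(start_date, end_date))
    for event_id, field in violations:
        print(f"Integrity violation at {event_id} ({field})")
    print(f"{len(violations)} violation(s) between {start_date} and {end_date}")
    return 1 if violations else 0
```

In the actual script, the last line would be something like sys.exit(main(sys.argv[1:], fetch_audit_records)). Returning a nonzero exit code on any violation lets a cron wrapper or monitoring job alert on failures instead of relying on someone reading the log file.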

Verification

aws s3 ls s3://audit-logs/audit/$(date -u +%Y-%m-%d)/ --recursive | wc -l

Expected output: number of audit records written today, greater than 0.

Common failures

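Duplicate audit records on retry: generate a deterministic event_id by hashing the request body, or use idempotency keys from the calling service.
S3 object lock fails: Object Lock must be enabled on the bucket, which also requires versioning. The simplest path is to enable it when the bucket is created, with aws s3api create-bucket --bucket audit-logs --object-lock-enabled-for-bucket (outside us-east-1, also pass --create-bucket-configuration LocationConstraint=<region>). Older AWS guidance stated that Object Lock could not be enabled retroactively; AWS has since added support for enabling it on existing buckets, so check the current S3 documentation before migrating to a new bucket.
Token-level data not available from API: some LLM providers do not expose per-token output. In this case, store the text with a cryptographic hash as a substitute.
High write costs at scale: batch audit records into larger objects (for example, one per minute) to reduce S3 PUT costs. Records buffered in memory are lost if the process exits before the batch is written, so flush on shutdown and ensure batch writes complete within the retention window.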
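
The retry and batching items above can be sketched as follows. All names here (AUDIT_NAMESPACE, deterministic_event_id, serialize_batch, batch_key) are hypothetical, and the JSON Lines layout and key format for batches are assumptions, not something the guide prescribes. Note that hashing the request body alone also collapses two legitimately identical requests into one event_id, so prefer an idempotency key from the caller when one exists.

```python
import hashlib
import json
import uuid

# Fixed namespace for this sketch; any constant UUID works if it never changes.
AUDIT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/llm-audit")


def deterministic_event_id(request_body, idempotency_key=None):
    """Derive a stable event_id so a retried request maps to the same record."""
    if idempotency_key is not None:
        name = f"key:{idempotency_key}"
    else:
        # Sort keys and drop whitespace so equal bodies hash identically.
        canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
        name = "body:" + hashlib.sha256(canonical.encode()).hexdigest()
    return str(uuid.uuid5(AUDIT_NAMESPACE, name))


def serialize_batch(records):
    """Encode records as JSON Lines so one S3 object can hold many events."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records).encode("utf-8")


def batch_key(minute_timestamp):
    """Object key for a batch; expects an ISO string such as '2024-05-01T12:34'."""
    day = minute_timestamp[:10]
    hhmm = minute_timestamp[11:16].replace(":", "")
    return f"audit/{day}/batch-{hhmm}.jsonl"
```

With a deterministic event_id, a retried write targets the same object key instead of creating a second key; on a versioned bucket this still produces a new object version, so deduplication at read time may be needed.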
