12. Health Checks

Chapter 12 of 18 · 15 min

KEY INSIGHT

Health endpoints let orchestration systems verify readiness. Separate liveness probes from readiness checks to enable graceful degradation.

Kubernetes uses health checks to manage pod lifecycle. Liveness probes determine whether a container should be restarted. Readiness probes determine whether a container can receive traffic. These probes must return quickly and accurately reflect the service's ability to function.

A naive health endpoint simply returns 200. This passes when the server starts but provides no information about downstream dependencies. A realistic health check verifies database connectivity, cache availability, and external API reachability before reporting healthy status.

```python
import time

from fastapi import FastAPI
from fastapi.responses import JSONResponse
from pydantic import BaseModel


class HealthStatus(BaseModel):
    status: str
    checks: dict


app = FastAPI()
# Assumes startup code stores an asyncpg pool in app.state.db_pool
# and an async Redis client in app.state.redis.


async def check_database() -> dict:
    """Run a trivial query to confirm the database is reachable."""
    try:
        pool = app.state.db_pool
        start = time.perf_counter()
        async with pool.acquire() as conn:
            await conn.fetchval("SELECT 1")
        latency = (time.perf_counter() - start) * 1000
        return {"database": {"status": "healthy", "latency_ms": round(latency, 1)}}
    except Exception as exc:
        return {"database": {"status": "unhealthy", "error": str(exc)}}


async def check_cache() -> dict:
    """Ping Redis to confirm the cache is reachable."""
    try:
        redis = app.state.redis
        start = time.perf_counter()
        await redis.ping()
        latency = (time.perf_counter() - start) * 1000
        return {"cache": {"status": "healthy", "latency_ms": round(latency, 1)}}
    except Exception as exc:
        return {"cache": {"status": "unhealthy", "error": str(exc)}}


@app.get("/health/live")
async def liveness():
    # No dependency checks: only confirms the process can serve requests.
    return HealthStatus(status="alive", checks={})


@app.get("/health/ready")
async def readiness():
    checks = {}
    checks.update(await check_database())
    checks.update(await check_cache())

    unhealthy = [k for k, v in checks.items() if v.get("status") == "unhealthy"]
    if unhealthy:
        # 503 tells the orchestrator to stop routing traffic to this instance.
        return JSONResponse(
            status_code=503,
            content=HealthStatus(status="unhealthy", checks=checks).model_dump(),
        )
    return HealthStatus(status="healthy", checks=checks)
```

Liveness endpoints return immediately with no dependency checks. A liveness probe slow enough to exceed the probe timeout can cause Kubernetes to restart containers unnecessarily. Readiness endpoints perform thorough checks and return 503 when dependencies fail, signaling that traffic should be routed elsewhere.

Monitor health endpoint latency in production. A health check taking more than 100ms suggests resource contention or connection pool exhaustion. Alert on prolonged slowness before it impacts actual request handling.
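
To make "alert on prolonged slowness" concrete, one option is to keep a short rolling window of health check latencies and alert only when every sample in the window exceeds the threshold. The sketch below is illustrative, not a prescribed implementation: `SlowCheckTracker`, its method names, and the window size are hypothetical, and only the 100ms threshold comes from the text above.

```python
from collections import deque

SLOW_THRESHOLD_MS = 100.0  # threshold mentioned in the text


class SlowCheckTracker:
    """Track recent health check latencies and flag sustained slowness."""

    def __init__(self, window: int = 5, threshold_ms: float = SLOW_THRESHOLD_MS):
        self.threshold_ms = threshold_ms
        # Old samples fall off automatically once the window is full.
        self.samples: deque = deque(maxlen=window)

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def should_alert(self) -> bool:
        """True only when the window is full and every sample is slow."""
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s > self.threshold_ms for s in self.samples)


tracker = SlowCheckTracker(window=3)
for latency in (40.0, 150.0, 180.0):
    tracker.record(latency)
print(tracker.should_alert())  # one fast sample is still in the window: no alert

tracker.record(210.0)  # window is now 150, 180, 210
print(tracker.should_alert())  # every sample exceeds 100 ms: alert
```

Requiring a full window of slow samples avoids alerting on a single spike. In practice the latencies would more likely be exported to a metrics system and the alert rule defined there.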

EXERCISE

Add a custom health check that verifies an OpenAI-compatible API endpoint responds within acceptable latency. Include the check results in the readiness endpoint and return unhealthy status when the external API exceeds 500ms.
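
One possible starting point for the exercise is sketched below, using only the standard library. The names `check_external_api`, `quick_probe`, and `sluggish_probe` are hypothetical, and the probe is passed in as a coroutine function so the latency logic can be tried without a real endpoint. In a real service, the probe would issue a lightweight request to the external API with whichever HTTP client the application already uses.

```python
import asyncio
import time
from typing import Awaitable, Callable

MAX_LATENCY_MS = 500  # threshold from the exercise


async def check_external_api(
    probe: Callable[[], Awaitable[None]],
    max_latency_ms: float = MAX_LATENCY_MS,
) -> dict:
    """Report unhealthy if the probe fails or takes longer than max_latency_ms."""
    start = time.perf_counter()
    try:
        # Stop waiting once the budget is spent so the readiness probe stays fast.
        await asyncio.wait_for(probe(), timeout=max_latency_ms / 1000)
    except asyncio.TimeoutError:
        return {
            "external_api": {
                "status": "unhealthy",
                "error": f"exceeded {max_latency_ms}ms",
            }
        }
    except Exception as exc:
        return {"external_api": {"status": "unhealthy", "error": str(exc)}}
    latency_ms = (time.perf_counter() - start) * 1000
    return {"external_api": {"status": "healthy", "latency_ms": round(latency_ms, 1)}}


# Hypothetical probes standing in for a real HTTP request.
async def quick_probe() -> None:
    await asyncio.sleep(0.01)


async def sluggish_probe() -> None:
    await asyncio.sleep(2.0)


if __name__ == "__main__":
    print(asyncio.run(check_external_api(quick_probe)))
    print(asyncio.run(check_external_api(sluggish_probe)))
```

The returned dictionary has the same shape as the database and cache results, so the readiness endpoint could merge it with `checks.update(await check_external_api(probe))` and rely on the existing unhealthy detection to return 503.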