Error Handling — First Local Chatbot (Chapter 11)

Four error categories will surface during development:

1. Ollama connection errors. httpx raises httpx.ConnectError when Ollama is not running and httpx.TimeoutException when a request takes too long. Catch both and return a structured error response:

import httpx
from fastapi.responses import JSONResponse

@app.post("/chat")
async def chat(...):
    try:
        # ... streaming logic
    except httpx.ConnectError:
        # Ollama is unreachable: 503 Service Unavailable
        return JSONResponse({"error": "Ollama is not running. Start it with: ollama serve"}, status_code=503)
    except httpx.TimeoutException:
        # Ollama did not answer in time: 504 Gateway Timeout
        return JSONResponse({"error": "Ollama timed out. The model may be slow or overloaded."}, status_code=504)
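
One caveat, assuming the streaming logic returns a streaming response whose generator talks to Ollama: the generator runs after the endpoint function has returned, so a try/except in the endpoint will not see failures raised mid-stream, and the status code has already been sent by then. A possible workaround is to catch the failure inside the generator and emit it as a final SSE event. The sketch below uses only the standard library; guarded_stream, sse_error, and failing_source are hypothetical names, and the built-in ConnectionError and TimeoutError stand in for httpx.ConnectError and httpx.TimeoutException.

```python
import asyncio
import json
from typing import AsyncIterator


def sse_error(message: str) -> str:
    """Format an error message as a single SSE event."""
    return f"data: {json.dumps({'error': message})}\n\n"


async def guarded_stream(source: AsyncIterator[str]) -> AsyncIterator[str]:
    """Relay chunks from `source`; turn a mid-stream failure into an SSE error event."""
    try:
        async for chunk in source:
            yield chunk
    except ConnectionError:
        # Stand-in for httpx.ConnectError in this sketch.
        yield sse_error("Ollama is not running. Start it with: ollama serve")
    except TimeoutError:
        # Stand-in for httpx.TimeoutException in this sketch.
        yield sse_error("Ollama timed out. The model may be slow or overloaded.")


async def failing_source() -> AsyncIterator[str]:
    """Hypothetical upstream: emits one chunk, then loses the connection."""
    yield 'data: {"token": "Hi"}\n\n'
    raise ConnectionError("connection refused")


async def collect(stream: AsyncIterator[str]) -> list[str]:
    """Drain an async stream into a list so the result can be inspected."""
    return [chunk async for chunk in stream]


events = asyncio.run(collect(guarded_stream(failing_source())))
```

Here events holds the chunk that arrived before the failure, followed by one error event that the frontend can display. Errors raised before streaming starts can still use the 503 and 504 responses shown above.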

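2. Model not found. Ollama returns a JSON error body. Check each line of Ollama's newline-delimited JSON response for an error key, and forward any error to the client as an SSE event: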

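# With httpx.AsyncClient, use: async for line in resp.aiter_lines()
for line in resp.iter_lines():
    if line:
        try:
            data = json.loads(line)
            if data.get("error"):
                # Forward Ollama's error as one SSE event, then end the stream
                yield f"data: {json.dumps({'error': data['error']})}\n\n"
                return
        except json.JSONDecodeError:
            # Skip lines that are not valid JSON
            pass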
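
To exercise this logic without a running Ollama instance, the loop can be pulled into a plain generator and fed sample lines. This is a minimal sketch, not the chapter's implementation: relay_ollama_lines is a hypothetical helper, the sample lines are made up for illustration, and forwarding non-error lines unchanged is an assumption about what the rest of the streaming logic does.

```python
import json
from typing import Iterable, Iterator


def relay_ollama_lines(lines: Iterable[str]) -> Iterator[str]:
    """Convert newline-delimited JSON lines into SSE events, stopping at the first error."""
    for line in lines:
        if not line:
            continue  # skip blank lines
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(data, dict):
            continue  # only JSON objects can carry an "error" key
        if data.get("error"):
            yield f"data: {json.dumps({'error': data['error']})}\n\n"
            return
        # Assumption: non-error objects are forwarded to the client as-is.
        yield f"data: {json.dumps(data)}\n\n"


# Illustrative input only; real Ollama output will differ.
sample_lines = [
    '{"message": {"content": "Hi"}, "done": false}',
    "",
    "not json",
    '{"error": "model not found"}',
    '{"message": {"content": "never sent"}}',
]
relayed = list(relay_ollama_lines(sample_lines))
```

The blank and non-JSON lines are skipped, and nothing after the error line is relayed, so the client sees one normal event followed by one error event.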

3. SSE format errors. The frontend receives malformed chunks if the backend sends data without the data: prefix or without the terminating double newline. On the frontend, skip any line that lacks the prefix:

if (!line.startsWith("data: ")) continue;
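
The same rule can also be enforced on the backend, so malformed events are caught before they leave the server. The sketch below is one way to do that, assuming every event is a single data-only line; format_sse and is_valid_sse_event are hypothetical helpers. The check is deliberately stricter than the SSE format itself, which also allows other fields and multi-line data, because it mirrors what the frontend filter above expects.

```python
import json


def format_sse(payload: dict) -> str:
    """Serialize a payload as one SSE event: 'data: ' prefix plus a blank-line terminator."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return f"data: {json.dumps(payload)}\n\n"


def is_valid_sse_event(chunk: str) -> bool:
    """Return True if chunk is a single-line 'data: ' event ending in a double newline."""
    return (
        chunk.startswith("data: ")
        and chunk.endswith("\n\n")
        and "\n" not in chunk[:-2]
    )


good = format_sse({"token": "hi\nthere"})
missing_prefix = '{"token": "hi"}\n\n'
missing_terminator = 'data: {"token": "hi"}\n'
```

Building every event through a single helper such as format_sse means the prefix and terminator cannot be forgotten at individual yield sites.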

4. Session not found. If a session ID does not exist, return an empty history rather than a 404:

@app.get("/session/{session_id}")
def get_session(session_id: str):
    return {"history": sessions.get(session_id, [])}

Log all errors to stdout with timestamps for debugging.
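
A minimal way to get timestamped output on stdout is Python's standard logging module. The logger name "chatbot", the helper name build_error_logger, and the format string are illustrative choices, not requirements of the chapter.

```python
import logging
import sys
from typing import TextIO


def build_error_logger(stream: TextIO = sys.stdout) -> logging.Logger:
    """Return a logger that writes timestamped records to the given stream."""
    logger = logging.getLogger("chatbot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    handler = logging.StreamHandler(stream)
    # %(asctime)s adds a timestamp such as 2024-01-01 12:00:00,123
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False  # keep records from also reaching the root logger
    return logger
```

Inside an except block, calling logger.exception("Ollama request failed") records the message together with the traceback, which is usually more useful for debugging than the message alone.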

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.