08. Multi-Model Gateway

Chapter 8 of 18 · 20 min

Local verification checkpoint

Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
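One way to capture the checkpoint details is a small helper that writes them out as JSON. This is a minimal sketch using only the Python standard library. The function name `environment_record`, its field names, and the example values in the usage lines are hypothetical, not something this chapter prescribes.

```python
import json
import platform
from importlib import metadata


def environment_record(packages, data_path, observed_output, constraints=""):
    """Collect the details the checkpoint asks for into one dictionary."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap instead of failing, so the note stays honest.
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": versions,
        "data_path": str(data_path),
        "observed_output": observed_output,
        # Free-text note: model size, vector count, CPU/GPU backend, memory.
        "constraints": constraints,
    }


if __name__ == "__main__":
    # Placeholder values; replace them with what you actually ran and saw.
    record = environment_record(
        packages=["pip"],
        data_path="./data",
        observed_output="(paste observed output here)",
        constraints="(note model size, backend, or memory limits here)",
    )
    print(json.dumps(record, indent=2))
```

Saving the printed JSON next to the exercise keeps the recorded constraints together with the result they apply to.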

EXERCISE

Create a gateway that routes requests for "llama3.2" to one mock endpoint and requests for "mistral" to a different mock endpoint. Verify that a request naming each model identifier reaches its intended endpoint.
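A possible starting point for this exercise, using only the Python standard library, is sketched below. It makes several assumptions that the exercise does not specify: the model identifier is read from a `model` field in a JSON request body, unknown models receive a 404, and the mock endpoints are local HTTP servers that report which one answered. The backend labels `llama-mock` and `mistral-mock` and all helper names are hypothetical.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores proxy environment variables, so local calls stay local.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def send_json(handler, status, payload):
    """Write a JSON response with the given status code."""
    body = json.dumps(payload).encode()
    handler.send_response(status)
    handler.send_header("Content-Type", "application/json")
    handler.send_header("Content-Length", str(len(body)))
    handler.end_headers()
    handler.wfile.write(body)


def read_json(handler):
    """Return (raw body, parsed dict); the dict is empty if parsing fails."""
    raw = handler.rfile.read(int(handler.headers.get("Content-Length", 0)))
    try:
        parsed = json.loads(raw)
    except ValueError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    return raw, parsed


def mock_handler(name):
    """Handler class for a mock endpoint that reports its own name."""
    class Mock(BaseHTTPRequestHandler):
        def do_POST(self):
            _, payload = read_json(self)
            send_json(self, 200, {"backend": name, "model": payload.get("model")})

        def log_message(self, *args):  # silence per-request logging
            pass
    return Mock


def gateway_handler(routes):
    """Handler class that forwards each request based on its 'model' field."""
    class Gateway(BaseHTTPRequestHandler):
        def do_POST(self):
            raw, payload = read_json(self)
            model = payload.get("model")
            upstream = routes.get(model) if isinstance(model, str) else None
            if upstream is None:
                send_json(self, 404, {"error": "unknown model"})
                return
            request = urllib.request.Request(
                upstream, data=raw, headers={"Content-Type": "application/json"})
            with OPENER.open(request, timeout=5) as response:
                send_json(self, response.status, json.loads(response.read()))

        def log_message(self, *args):
            pass
    return Gateway


def start(handler_class):
    """Serve on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), handler_class)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def url_of(server):
    return f"http://127.0.0.1:{server.server_address[1]}/"


def post(url, payload):
    """POST a JSON payload and return (status code, decoded JSON body)."""
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    try:
        with OPENER.open(request, timeout=5) as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as error:
        return error.code, json.loads(error.read())


def build_demo():
    """Start two mock endpoints and a gateway that routes between them."""
    llama = start(mock_handler("llama-mock"))
    mistral = start(mock_handler("mistral-mock"))
    routes = {"llama3.2": url_of(llama), "mistral": url_of(mistral)}
    return start(gateway_handler(routes)), [llama, mistral]


if __name__ == "__main__":
    gateway, backends = build_demo()
    for model in ("llama3.2", "mistral", "unknown"):
        print(model, post(url_of(gateway), {"model": model}))
    for server in [gateway, *backends]:
        server.shutdown()
```

The routing table is a plain dictionary, so adding a third model is a one-line change. Having each mock endpoint echo its own label makes the verification step a direct comparison between the requested model and the backend that answered.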