08. Multi-Model Gateway
Chapter 8 of 18 · 20 min
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
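One way to capture the checkpoint details is a small helper that gathers them into a single record. The sketch below is only an illustration: the function name `checkpoint_record` and its field names are hypothetical, and the package list, data path, and notes are whatever applies to your own run.

```python
import json
import platform
from importlib import metadata


def checkpoint_record(packages, data_path, observed_output, constraints=""):
    """Collect the details the checkpoint asks for (hypothetical helper)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),   # runtime version
        "platform": platform.platform(),       # OS and architecture summary
        "packages": versions,                  # installed package versions
        "data_path": str(data_path),
        "observed_output": observed_output,
        "constraints": constraints,            # e.g. model size, backend, memory
    }


if __name__ == "__main__":
    record = checkpoint_record(
        packages=["requests"],                 # replace with the packages you used
        data_path="./data",
        observed_output="paste the output you saw here",
        constraints="CPU only",
    )
    print(json.dumps(record, indent=2))
```

Saving the printed JSON next to the exercise keeps the run and its constraints together.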
EXERCISE
Create a gateway that routes requests for "llama3.2" to one mock endpoint and requests for "mistral" to a different mock endpoint. Send requests with each model identifier and verify that every request reaches the endpoint mapped to its model.
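A minimal sketch of one possible solution follows, using only the Python standard library. It is not the chapter's reference implementation. It assumes the model identifier arrives in a JSON body field named `model`. The endpoint names `mock-a` and `mock-b`, the `served_by` response field, and all helper names are hypothetical. Returning 404 for an unknown model is a design choice of this sketch, not a requirement of the exercise.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Ignore system proxy settings so loopback requests stay local.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def read_body(handler):
    length = int(handler.headers.get("Content-Length", 0))
    return handler.rfile.read(length)


def send_json(handler, status, payload):
    body = json.dumps(payload).encode()
    handler.send_response(status)
    handler.send_header("Content-Type", "application/json")
    handler.send_header("Content-Length", str(len(body)))
    handler.end_headers()
    handler.wfile.write(body)


def post_json(url, raw):
    """POST raw JSON bytes and return (status, decoded JSON body)."""
    req = urllib.request.Request(
        url, data=raw, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with OPENER.open(req, timeout=5) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read())


def mock_endpoint(name):
    """Handler class for a mock endpoint that tags replies with its name."""
    class Mock(BaseHTTPRequestHandler):
        def do_POST(self):
            read_body(self)  # drain the request body
            send_json(self, 200, {"served_by": name})

        def log_message(self, *args):  # silence per-request logging
            pass
    return Mock


def gateway(routes):
    """Handler class that forwards each request based on its "model" field."""
    class Gateway(BaseHTTPRequestHandler):
        def do_POST(self):
            raw = read_body(self)
            try:
                model = json.loads(raw).get("model")
            except (ValueError, AttributeError):
                model = None
            upstream = routes.get(model) if isinstance(model, str) else None
            if upstream is None:
                send_json(self, 404, {"error": f"unknown model: {model!r}"})
                return
            status, payload = post_json(upstream, raw)
            send_json(self, status, payload)

        def log_message(self, *args):
            pass
    return Gateway


def start(handler_cls):
    """Serve on a free loopback port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), handler_cls)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


def run_demo():
    mock_a, url_a = start(mock_endpoint("mock-a"))
    mock_b, url_b = start(mock_endpoint("mock-b"))
    gw, gw_url = start(gateway({"llama3.2": url_a, "mistral": url_b}))
    try:
        results = {}
        for model in ("llama3.2", "mistral", "unknown-model"):
            body = json.dumps({"model": model, "prompt": "hi"}).encode()
            results[model] = post_json(gw_url, body)
        return results
    finally:
        for server in (gw, mock_a, mock_b):
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    for model, (status, payload) in run_demo().items():
        print(model, status, payload)
```

Each mock replies with its own name, so the verification step reduces to checking that the `served_by` value matches the endpoint mapped to the requested model. Keeping the routing table as a plain dictionary makes it easy to add a third model without touching the handler.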