10. Multiple Model Selection
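Add a model selector to the UI. When the user changes the selection, the next request uses the new model. Ollama identifies models by plain name strings, so the application needs no registration step. The function below fills a select element with the id modelSelect from the backend's model list: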
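// Populate the model <select> from the backend's GET /models endpoint.
async function loadModels() {
  const res = await fetch("/models");
  if (!res.ok) {
    throw new Error(`GET /models failed with status ${res.status}`);
  }
  const { models } = await res.json();
  const select = document.getElementById("modelSelect");
  select.innerHTML = ""; // clear stale options before repopulating
  for (const m of models) {
    const opt = document.createElement("option");
    opt.value = m;
    opt.textContent = m;
    select.appendChild(opt);
  }
}

loadModels().catch(console.error);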
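Once the selector is populated, the simplest way to make "the next request uses the new model" hold is to read the selector's value at send time rather than caching it. The sketch below assumes a hypothetical POST /chat endpoint that accepts a JSON body with model and message fields; neither the endpoint name nor the payload shape comes from this chapter, so adjust both to match your backend.

```javascript
// Hypothetical endpoint and payload shape; adjust to match your backend.
const CHAT_ENDPOINT = "/chat";

// Build the fetch options for one chat request using the given model name.
function buildChatRequest(model, message) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, message }),
  };
}

// Read the selector at send time so a changed selection
// applies to the very next request, with no extra state to sync.
async function sendMessage(message) {
  const model = document.getElementById("modelSelect").value;
  const res = await fetch(CHAT_ENDPOINT, buildChatRequest(model, message));
  return res.json();
}
```

Because the model is looked up inside sendMessage, no change listener on the selector is needed for correctness.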
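FastAPI already serves model names from the Ollama API at GET /models. To filter or group models (for example, to show only 7B-parameter models, or one entry per base model), parse the model string. The function below does the grouping. Note that a bare base name such as "llama3" resolves to the "latest" tag in Ollama, which may not be the variant you pulled, so treat the grouped names as display labels unless that tag is available locally: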
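def compatible_models() -> list[str]:
    """Return the distinct base model names, one per model family."""
    all_models = list_models()
    # Ollama model names look like "llama3:8b-instruct-q4_0".
    # Keep only the part before the colon so that size and
    # quantization variants collapse into a single entry.
    base_models = set()
    for m in all_models:
        base = m.split(":")[0]
        base_models.add(base)
    # Sort so the selector order is stable between requests.
    return sorted(base_models)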
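Filtering by parameter size, as opposed to grouping by base name, can be done with the same string parsing. The sketch below does it client-side and assumes the size appears as its own hyphen-separated token in the tag (for example "7b" in "mistral:7b-instruct"); the helper names parameterSize and filterBySize are illustrative, not part of this chapter's code.

```javascript
// Return the parameter-size token (e.g. "7b") from an Ollama-style
// model name, or null when the tag does not contain one.
// Tokens such as "8x7b" are deliberately not matched.
function parameterSize(name) {
  const tag = name.split(":")[1] ?? "";
  const token = tag.split("-").find((t) => /^\d+(\.\d+)?[bm]$/i.test(t));
  return token ? token.toLowerCase() : null;
}

// Keep only the models whose tag declares the requested size.
function filterBySize(models, size) {
  const wanted = size.toLowerCase();
  return models.filter((m) => parameterSize(m) === wanted);
}
```

Names without a size token, such as a bare "llama3", are excluded rather than guessed at, which keeps the filter conservative.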
One failure mode: if the selected model has not been pulled, Ollama returns an error such as {"error": "model not found"} (the exact wording may vary between Ollama versions). Handle it in the frontend by showing an alert: "Model not found. Run: ollama pull {selectedModel}".
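A minimal sketch of that frontend handling follows. It assumes the backend forwards Ollama's JSON error body unchanged, so the response contains an error string; if your FastAPI layer wraps errors differently, adapt the field name. The helper name modelErrorMessage is illustrative.

```javascript
// Map a response body to a user-facing message, or null if it
// carries no error. Assumes the backend forwards Ollama's
// { error: "..." } body unchanged (an assumption, not a given).
function modelErrorMessage(body, selectedModel) {
  if (!body || typeof body.error !== "string") return null;
  // Match on a substring because the exact wording may differ.
  if (body.error.toLowerCase().includes("not found")) {
    return `Model not found. Run: ollama pull ${selectedModel}`;
  }
  return `Ollama error: ${body.error}`;
}

// Possible use inside the request handler (browser only):
//   const body = await res.json();
//   const msg = modelErrorMessage(body, select.value);
//   if (msg) alert(msg);
```

Keeping the mapping in a pure function makes it easy to test without a browser or a running Ollama instance.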
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
Add a "Model Info" display that shows estimated size and quantization from the model name, parsed client-side.
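One possible starting point for this exercise is a small parser for the name format shown earlier ("llama3:8b-instruct-q4_0"). This is a sketch under assumptions: it expects the size and quantization to be hyphen-separated tokens in the tag, it does not handle names that include a registry host with a port, and the function name parseModelInfo is illustrative.

```javascript
// Split an Ollama-style model name into display fields.
// Missing parts are reported as null rather than guessed.
function parseModelInfo(name) {
  // A name without a colon has no explicit tag; Ollama treats it as "latest".
  const [base, tag = "latest"] = name.split(":");
  const tokens = tag.split("-");
  // Parameter size token, e.g. "8b" or "1.5b".
  const size = tokens.find((t) => /^\d+(\.\d+)?[bm]$/i.test(t)) ?? null;
  // Quantization token, e.g. "q4_0", "q4_K_M", or "fp16".
  const quantization =
    tokens.find((t) => /^(q\d\w*|f16|fp16|f32)$/i.test(t)) ?? null;
  return { base, tag, size, quantization };
}
```

The returned object can be rendered next to the selector whenever its value changes; fields that come back as null are best shown as "unknown" instead of being hidden.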