09. Personality Configuration
System prompts control the chatbot's personality. Ollama accepts a messages array in which the first message has role: "system". Accept the system prompt with each request and prepend it to the session history before calling the model:
DEFAULT_SYSTEM = "You are a helpful, concise assistant. Answer in plain text, no markdown unless requested."

@app.post("/chat")
async def chat(session_id: str, model: str, messages: list[dict], system_prompt: str = DEFAULT_SYSTEM):
    if session_id not in sessions:
        sessions[session_id] = []
    # Record the incoming messages so they reach the model and stay in the session
    sessions[session_id].extend(messages)
    # Inject the system prompt at the start; it is not stored in the history itself
    full_messages = [{"role": "system", "content": system_prompt}] + sessions[session_id]

    def stream():
        from app.ollama_client import stream_chat
        for chunk in stream_chat(model, full_messages):
            yield chunk

    return StreamingResponse(stream(), media_type="text/event-stream")
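The prepend step can be isolated into a small pure function, which makes it easy to test without running the server. This is a minimal sketch: the name build_messages is hypothetical, and dropping system messages already present in the history is an assumption added here so that only one system message is ever sent.

```python
def build_messages(system_prompt: str, history: list[dict]) -> list[dict]:
    """Return a new list: one system message followed by the non-system history."""
    # Assumption: any system message already in the history is stale, so drop it.
    turns = [m for m in history if m.get("role") != "system"]
    # Build a new list rather than mutating the stored session history.
    return [{"role": "system", "content": system_prompt}] + turns


history = [
    {"role": "system", "content": "Old persona."},
    {"role": "user", "content": "Hello"},
]
full_messages = build_messages("You are a helpful, concise assistant.", history)
```

Because the function returns a new list, the session history stays free of system messages and the persona can change between requests.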
On the frontend, add a system prompt textarea to the settings panel:
<label>System Prompt:</label>
<textarea id="systemPrompt" rows="3">You are a helpful, concise assistant.</textarea>
Send it with each request:
const systemPrompt = document.getElementById("systemPrompt").value;
const response = await fetch(`/chat?session_id=${sessionId}&model=${model}&system_prompt=${encodeURIComponent(systemPrompt)}`, { ... });
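When exercising the endpoint from a script instead of the browser, the same query string can be built in Python. The sketch below assumes the server runs at http://localhost:8000 and uses a hypothetical helper name, chat_url. It only constructs the URL and sends nothing.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def chat_url(base: str, session_id: str, model: str, system_prompt: str) -> str:
    """Build the /chat URL with every query parameter percent-encoded."""
    query = urlencode({
        "session_id": session_id,
        "model": model,
        "system_prompt": system_prompt,
    })
    return f"{base}/chat?{query}"


# Assumed local address and example values for illustration only.
url = chat_url("http://localhost:8000", "abc123", "llama3", "Be brief & polite.")
# Decode the query string again to confirm the prompt survives the round trip.
decoded = parse_qs(urlsplit(url).query)
```

Encoding matters because system prompts often contain spaces, ampersands, and question marks, which would otherwise be read as query-string syntax.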
Temperature controls randomness: low values make the output more deterministic, and higher values make it more varied. Add a slider:
<label>Temperature: <span id="tempVal">0.7</span></label>
<input type="range" id="tempSlider" min="0" max="2" step="0.1" value="0.7" />
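To see why the slider value changes the output, it helps to look at how temperature reshapes a probability distribution over candidate tokens. The sketch below is a simplified illustration of temperature-scaled softmax, not a description of Ollama's internal sampler. The logits are made-up numbers, and the function rejects a temperature of 0, which samplers commonly treat as a special case (always pick the top token).

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.2)  # sharply favors the top token
warm = softmax_with_temperature(logits, 2.0)  # spreads probability more evenly
```

At a low temperature nearly all probability sits on the highest-scoring token. At a high temperature the alternatives get a real chance of being sampled, which reads as more varied output.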
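Pass the slider value in the request payload and update stream_chat to include it in the request body sent to Ollama. Ollama reads temperature from the options object of the request, not from the top level.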
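One way to shape the request body is sketched below. The helper name build_chat_payload is hypothetical, and how stream_chat actually assembles its request is not shown in this chapter, so treat this as an assumption about where the value would go. Clamping to the slider's range of 0 to 2 is also an added safeguard, not something Ollama requires.

```python
def build_chat_payload(model: str, messages: list[dict], temperature: float = 0.7) -> dict:
    """Assemble a chat request body with temperature under "options"."""
    # Clamp to the slider's range so a malformed client value cannot slip through.
    temperature = max(0.0, min(2.0, float(temperature)))
    return {
        "model": model,
        "messages": messages,
        "stream": True,
        "options": {"temperature": temperature},
    }


payload = build_chat_payload(
    "llama3",  # example model name
    [{"role": "user", "content": "Hello"}],
    temperature=5,  # out of range on purpose; clamped to 2.0
)
```

stream_chat could then accept a temperature argument and send a body of this shape instead of one without options.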
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
Exercise: create a "Code Assistant" persona with the system prompt "You are a senior software engineer. Write clean, tested code with comments. Prefer idiomatic Python." Test it against a general-purpose prompt.