How to set temperature parameters for creative output
Prerequisites
- Ollama installed and running, with the llama3.2 model pulled
- curl and jq for the API examples
What this does
Temperature controls the randomness of token selection. Higher values (0.7-1.0) produce more diverse and creative outputs, making them ideal for storytelling, brainstorming, and creative writing.
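As a rough illustration of why this happens: samplers typically divide each token's raw score (logit) by the temperature before turning the scores into probabilities with a softmax. The sketch below applies that formula to three made-up logits using awk. The logit values and the helper name softmax_t are hypothetical and are not taken from Ollama.

```shell
# softmax_t TEMP LOGIT...
# Prints one probability per line for the given logits at temperature TEMP.
# Illustrative only: the logits are invented, not real model scores.
softmax_t() {
  temp=$1
  shift
  printf '%s\n' "$@" | awk -v t="$temp" '
    { logit[NR] = $1 }
    END {
      # Scale each logit by 1/t, exponentiate, then normalise to sum to 1.
      for (i = 1; i <= NR; i++) { e[i] = exp(logit[i] / t); sum += e[i] }
      for (i = 1; i <= NR; i++) printf "%.3f\n", e[i] / sum
    }'
}

echo "temperature 0.2:"; softmax_t 0.2 2.0 1.0 0.5
echo "temperature 0.9:"; softmax_t 0.9 2.0 1.0 0.5
```

At the lower temperature almost all of the probability lands on the highest-scoring token, while at 0.9 the alternatives keep a meaningful share, which is what makes the output more varied.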
Steps
Set temperature via Ollama command line.
ollama run llama3.2
/set parameter temperature 0.9
Enter the /set line at the interactive >>> prompt, not as an argument to ollama run. Then prompt: "Write a fantasy story opening."
Set temperature via the API.
curl -s http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Write a poem about AI", "options": {"temperature": 0.8}, "stream": false}' \
  | jq -r '.response'
Create a Modelfile for permanently creative defaults.
FROM llama3.2
PARAMETER temperature 0.85
PARAMETER top_p 0.95

ollama create creative-llama -f Modelfile

Compare outputs at different temperatures.
echo "=== Temperature 0.2 ==="
curl -s ... -d '{"options":{"temperature":0.2}}' | jq -r '.response'
echo "=== Temperature 0.9 ==="
curl -s ... -d '{"options":{"temperature":0.9}}' | jq -r '.response'
Expected: Low temperature produces predictable text; high temperature produces more varied word choices.
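The comparison commands elide the request details. One way to write them out in full is sketched here, assuming the endpoint and model from the API step are reused. OLLAMA_URL, MODEL, build_payload, and compare_temperatures are illustrative names, not part of Ollama.

```shell
# Assumptions: Ollama is listening on localhost:11434 and llama3.2 is pulled,
# as in the API step. All variable and function names here are illustrative.
OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434/api/generate}
MODEL=${MODEL:-llama3.2}

# build_payload PROMPT TEMPERATURE
# Uses jq to assemble the request body so the prompt is escaped correctly.
build_payload() {
  jq -cn --arg model "$MODEL" --arg prompt "$1" --argjson temp "$2" \
    '{model: $model, prompt: $prompt, options: {temperature: $temp}, stream: false}'
}

# compare_temperatures PROMPT TEMP...
# Sends the same prompt once per temperature and labels each response.
compare_temperatures() {
  prompt=$1
  shift
  for temp in "$@"; do
    echo "=== Temperature $temp ==="
    # curl reads the JSON body from stdin via "-d @-".
    build_payload "$prompt" "$temp" | curl -s "$OLLAMA_URL" -d @- | jq -r '.response'
  done
}
```

With the server running, a call such as compare_temperatures "Write a fantasy story opening." 0.2 0.9 prints both responses under their own headers, and further temperatures can be appended as extra arguments.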
Verification
# Generate the same prompt 3 times at temperature 0.9
for i in 1 2 3; do
  curl -s http://localhost:11434/api/generate \
    -d '{"model":"llama3.2","prompt":"Name a fantasy kingdom","options":{"temperature":0.9},"stream":false}' \
    | jq -r '.response'
done
# Expected: usually three different kingdom names (e.g., Eldoria, Drakemoor, Silvervale); an occasional repeat is normal because sampling is random
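To make this check less dependent on eyeballing, the responses can be counted. This is a minimal sketch, assuming each response fits on one line. A multi-line response would be counted line by line, so a prompt that asks for a single name works best. count_distinct is a hypothetical helper name.

```shell
# count_distinct: reads responses on stdin (one per line) and prints how many
# distinct non-empty lines there are, ignoring case and surrounding whitespace.
count_distinct() {
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | tr '[:upper:]' '[:lower:]' \
    | awk 'NF && !seen[$0]++ { n++ } END { print n + 0 }'
}

# Example with canned input; in practice, pipe the loop's output in instead.
printf 'Eldoria\nDrakemoor\neldoria \n' | count_distinct   # prints 2
```

Piping the verification loop into the helper (done | count_distinct) should usually print 3 at temperature 0.9. A consistently lower count may indicate that the temperature option is not being applied.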
Common failures
- Temperature too high causes gibberish: Values above 1.2 can produce incoherent output. Stay within 0.0-1.5, and lower the value as soon as the text stops making sense.
- Temperature too low for creativity: Values below 0.5 restrict word choice. For creative tasks, use 0.8-1.0.
- Perplexity increases with high temperature: Higher temperature means more surprising word choices. To keep creative output plausible, combine it with top_p: 0.9.
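The effect of top_p in the last item can be sketched with a toy distribution. Nucleus (top-p) sampling, as generally described, keeps only the most probable tokens whose cumulative probability reaches p and drops the rest before sampling. The tokens, probabilities, and the helper name top_p_filter below are invented for illustration, and this is a simplification rather than Ollama's actual implementation.

```shell
# top_p_filter P
# Reads "token probability" pairs on stdin, sorted from most to least
# probable, and prints the tokens that nucleus sampling would keep.
top_p_filter() {
  awk -v p="$1" '
    # Keep adding tokens until the cumulative probability reaches p.
    cum < p { print $1; cum += $2 }
  '
}

printf 'castle 0.55\nkeep 0.25\nfort 0.12\nbanana 0.08\n' | top_p_filter 0.9
```

Here the unlikely tail token is removed, so a high temperature can still vary the choice among the plausible candidates. In the API, both settings can be passed together in the options object, for example "options": {"temperature": 0.9, "top_p": 0.9}.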
Operator checkpoint
Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.