HOW-TO · INF

How to set temperature to zero for deterministic responses

Beginner · 5 min · By Fredoline Eruo
PREREQUISITES

Ollama installed

What this does

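Setting temperature to 0 makes the model pick the most probable next token at every step (greedy decoding) instead of sampling from the distribution. With the same model, runtime version, and hardware, this usually gives repeatable output, which suits factual queries, code generation, and testing.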
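
To make the effect of temperature concrete, here is a small illustrative sketch that applies temperature scaling to a handful of made-up logits with awk. The function name softmax_t and the example logits are assumptions for illustration only; this is not Ollama's actual sampler, and it treats a temperature of 0 as the greedy limit where all probability lands on the largest logit.

```shell
# softmax_t TEMP LOGIT...: print each logit's probability after temperature scaling.
# Illustrative only. TEMP=0 is handled as the greedy limit (argmax gets probability 1).
softmax_t() {
  t="$1"; shift
  awk -v t="$t" 'BEGIN {
    n = ARGC - 1
    # Find the index of the largest logit.
    best = 1
    for (i = 2; i <= n; i++) if (ARGV[i] + 0 > ARGV[best] + 0) best = i
    if (t + 0 == 0) {
      # Greedy limit: avoid dividing by zero, put all mass on the largest logit.
      for (i = 1; i <= n; i++) printf "%s%.3f", (i > 1 ? " " : ""), (i == best ? 1 : 0)
      print ""
      exit
    }
    # Subtract the largest logit before exponentiating for numerical stability.
    sum = 0
    for (i = 1; i <= n; i++) { e[i] = exp((ARGV[i] - ARGV[best]) / t); sum += e[i] }
    for (i = 1; i <= n; i++) printf "%s%.3f", (i > 1 ? " " : ""), e[i] / sum
    print ""
  }' "$@"
}
```

For the logits 2.0 1.0 0.5, softmax_t 1 2.0 1.0 0.5 prints 0.629 0.231 0.140, while softmax_t 0 2.0 1.0 0.5 prints 1.000 0.000 0.000. Lowering the temperature concentrates probability on the top token until there is nothing left to sample.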

Steps

  1. Set temperature to 0 via the API.

    curl -s http://localhost:11434/api/generate \
      -d '{"model": "llama3.2", "prompt": "What is 2+2?",
           "options": {"temperature": 0}, "stream": false}' \
      | jq -r '.response'
    
  2. Add a fixed seed for stronger reproducibility. Where the seed option is honored, it fixes the random number generator so that sampling is repeatable even at non-zero temperatures.

    curl -s http://localhost:11434/api/generate \
      -d '{"model": "llama3.2", "prompt": "Explain gravity",
           "options": {"temperature": 0, "seed": 42}, "stream": false}'
    
  3. Set it in an Ollama interactive session.

    ollama run llama3.2
    /set parameter temperature 0
    
  4. Verify determinism by running the same prompt twice.

    # First call
    curl -s ... -d '{"options":{"temperature":0}}' | jq -r '.response' > out1.txt
    # Second call
    curl -s ... -d '{"options":{"temperature":0}}' | jq -r '.response' > out2.txt
    # Compare
    diff out1.txt out2.txt   # in Windows cmd, use: fc out1.txt out2.txt
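
If the same settings are needed in every session, one option is to bake them into a derived model through a Modelfile rather than passing them on each request. The sketch below only writes the file; the helper name write_modelfile, the output path, and the derived model name used afterwards are hypothetical choices, and the seed value 42 simply mirrors step 2.

```shell
# write_modelfile BASE_MODEL OUT_FILE: write a Modelfile that pins sampling parameters.
# Hypothetical helper; temperature 0 and seed 42 mirror the values used in the steps above.
write_modelfile() {
  cat > "$2" <<EOF
FROM $1
PARAMETER temperature 0
PARAMETER seed 42
EOF
}
```

A possible way to use it, assuming llama3.2 is already pulled and llama3.2-det is a free name:

    write_modelfile llama3.2 Modelfile
    ollama create llama3.2-det -f Modelfile
    ollama run llama3.2-det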
    

Verification

diff out1.txt out2.txt
# Expected: no output and exit status 0, meaning both responses are byte-for-byte identical.
# In Windows cmd, fc out1.txt out2.txt prints "FC: no differences encountered" instead.
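
Two runs can agree by chance, so it may be worth repeating the request several times and counting how many distinct responses come back. The following is a minimal sketch of that idea: count_distinct and ask_once are hypothetical helper names, and ask_once assumes the same endpoint, model, and options as the earlier steps.

```shell
# count_distinct N CMD [ARGS...]: run CMD N times and print the number of distinct outputs.
# Each output is reduced to a checksum, then duplicates are collapsed with sort -u.
count_distinct() {
  n="$1"; shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" | cksum
    i=$((i + 1))
  done | sort -u | wc -l | tr -d ' '
}

# ask_once: hypothetical wrapper around the request from the steps above.
# Assumes a local Ollama server, the llama3.2 model, and jq on the PATH.
ask_once() {
  curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.2", "prompt": "What is 2+2?",
         "options": {"temperature": 0, "seed": 42}, "stream": false}' \
    | jq -r '.response'
}
```

With the server running, a check over five requests would look like this, and a result of 1 means all five responses matched:

    count_distinct 5 ask_once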

Common failures

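  • Outputs still differ at temperature 0: GPU backends can use non-deterministic kernels, so floating-point results may vary between runs regardless of the model. For cuBLAS-based code paths, setting the environment variable CUBLAS_WORKSPACE_CONFIG=:4096:8 before starting the server may help. Also keep the hardware, runtime version, and model version fixed when comparing runs.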
  • seed parameter ignored: Not all models support seed. Check model documentation or use temperature=0 alone.
  • Too deterministic for creative tasks: Use temperature 0 only for factual recall, code, and math. Creative tasks need non-zero temperature.

Operator checkpoint

Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.
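
One way to capture those details is a short script that prints them as key=value lines. This is a sketch under assumptions: record_env is a hypothetical name, the GPU line only covers NVIDIA hardware through nvidia-smi, and tools that are not installed are reported as missing instead of causing an error.

```shell
# record_env: print the runtime details worth keeping next to the verification output.
# Hypothetical helper; each tool is checked with command -v before it is called.
record_env() {
  echo "date=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo "os=$(uname -sr)"
  if command -v ollama >/dev/null 2>&1; then
    # Keep only the last line in case warnings are printed first.
    echo "ollama=$(ollama --version 2>&1 | tail -n 1)"
  else
    echo "ollama=not found"
  fi
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo "gpu=$(nvidia-smi --query-gpu=name,driver_version --format=csv,noheader 2>/dev/null | head -n 1)"
  else
    echo "gpu=nvidia-smi not found"
  fi
}
```

Redirecting the output to a notes file (the filename here is only an example) keeps it alongside out1.txt and out2.txt:

    record_env > run-notes.txt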
