HOW-TO · INF

How to tune reasoning depth in R1 models using parameters

Intermediate · 15 min · By Fredoline Eruo
PREREQUISITES

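DeepSeek-R1 model installed and served locally (the examples assume Ollama listening on localhost:11434 with the deepseek-r1:32b tag), plus curl and jq for the shell commands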

What this does

DeepSeek-R1's reasoning depth can be influenced through sampling parameters. Higher temperatures tend to produce more exploratory reasoning chains; lower temperatures tend to yield more direct solutions. This guide covers tuning temperature, top_p, and the output token limit for different problem types.
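To make the effect of these parameters concrete, here is a toy sketch of temperature scaling and top_p (nucleus) filtering on a made-up distribution over four tokens. It illustrates the general technique only; the logits are hypothetical and the code is not how any particular runtime implements sampling.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append(index)
        cumulative += p
        if cumulative >= top_p:
            break
    return sorted(kept)

# Hypothetical logits for four candidate tokens.
toy_logits = [2.0, 1.0, 0.5, -1.0]
cold = softmax_with_temperature(toy_logits, 0.2)  # sharper distribution
warm = softmax_with_temperature(toy_logits, 1.0)  # flatter distribution
print("T=0.2:", [round(p, 3) for p in cold])
print("T=1.0:", [round(p, 3) for p in warm])
print("top_p=0.5 keeps tokens:", top_p_filter(warm, 0.5))
print("top_p=0.9 keeps tokens:", top_p_filter(warm, 0.9))
```

At low temperature almost all probability mass sits on the top token, and a low top_p discards the tail entirely. Both effects reduce how often the model branches into alternative paths.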

Steps

  1. Use temperature to control reasoning breadth. Lower values (0.0-0.3) produce focused reasoning; higher values (0.7-1.0) explore alternative paths.

    curl -s http://localhost:11434/api/generate \
      -d '{"model": "deepseek-r1:32b", "prompt": "Prove the Pythagorean theorem", "stream": false, "options": {"temperature": 0.1}}' \
      | jq -r '.response' > focused.txt
    
  2. Set num_predict (Ollama's name for the maximum number of generated tokens) to limit reasoning length. The cap covers the reasoning chain and the final answer together. Short reasoning tasks (basic math) need fewer tokens; open-ended problems benefit from more.

    curl -s http://localhost:11434/api/generate \
      -d '{"model": "deepseek-r1:32b", "prompt": "Is P=NP? Explain", "stream": false, "options": {"num_predict": 2048}}' \
      | jq -r '.response'
    
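  3. Use top_p to narrow token selection. A top_p of 0.5 restricts sampling to the smallest set of high-probability tokens whose cumulative probability reaches 0.5, which tends to shorten reasoning chains. The example below uses a milder 0.7 together with a moderate temperature.

    curl -s http://localhost:11434/api/generate \
      -d '{"model": "deepseek-r1:32b", "prompt": "Design a sorting algorithm", "stream": false, "options": {"temperature": 0.4, "top_p": 0.7}}' \
      | jq -r '.response'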
    
  4. Compare reasoning chain lengths across settings.

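    import requests

    # Grid of sampling settings: 3 temperatures x 3 top_p values.
    settings = [{"temperature": t, "top_p": p} for t in [0.0, 0.5, 1.0] for p in [0.5, 0.9, 1.0]]
    for opts in settings:
        r = requests.post("http://localhost:11434/api/generate",
            json={"model": "deepseek-r1:32b", "prompt": "Solve 2x+5=15", "options": opts, "stream": False})
        r.raise_for_status()  # fail loudly on HTTP errors
        # Character count of the full response (reasoning plus answer).
        print(opts, "length:", len(r.json()["response"]))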
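Step 4 measures the whole response, so a long final answer inflates the number. A minimal sketch for measuring only the reasoning part follows, assuming the model's chain of thought arrives inside a single `<think>...</think>` pair in the response text. Whether that holds depends on the runtime and its version; `split_reasoning` and `reasoning_word_count` are hypothetical helper names.

```python
import re

# Assumption: the reasoning is wrapped in one <think>...</think> pair
# inside the response text.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(response_text):
    """Return (reasoning, answer) extracted from a response string."""
    match = THINK_PATTERN.search(response_text)
    if match is None:
        # No complete think block: treat the whole text as the answer.
        return "", response_text.strip()
    reasoning = match.group(1).strip()
    answer = response_text[match.end():].strip()
    return reasoning, answer

def reasoning_word_count(response_text):
    """Count whitespace-separated words in the reasoning part only."""
    reasoning, _ = split_reasoning(response_text)
    return len(reasoning.split())

# Made-up response text for illustration.
sample = "<think>\nSubtract 5 from both sides: 2x = 10. Divide by 2.\n</think>\n\nx = 5"
reasoning, answer = split_reasoning(sample)
print(reasoning_word_count(sample), "reasoning words; answer:", answer)
```

In the comparison loop, `reasoning_word_count(r.json()["response"])` could replace the raw `len(...)` call to compare chains rather than total output.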
    

Verification

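# Compare output files. focused.txt comes from step 1; create exploratory.txt
# by rerunning the step 1 command with a higher temperature (e.g. 0.9)
# and redirecting the output to exploratory.txt.
wc -w focused.txt exploratory.txt
# Expected: exploratory chains roughly 2-5x longer than focused chains; the ratio varies by prompt and run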
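The same check can be scripted so the ratio is computed rather than eyeballed. This is a small sketch; `length_ratio` is a hypothetical helper, and the inline strings stand in for the contents of the two output files.

```python
def word_count(text):
    """Count whitespace-separated words, like wc -w."""
    return len(text.split())

def length_ratio(focused_text, exploratory_text):
    """Return the exploratory/focused word ratio, or None if focused is empty."""
    focused_words = word_count(focused_text)
    if focused_words == 0:
        return None
    return word_count(exploratory_text) / focused_words

# Stand-in strings; in practice, read focused.txt and exploratory.txt.
focused_demo = "a squared plus b squared equals c squared"
exploratory_demo = focused_demo + " but consider also a proof by similar triangles"
print("ratio:", length_ratio(focused_demo, exploratory_demo))
```

With real files, pass `open("focused.txt", encoding="utf-8").read()` and the matching read of exploratory.txt as the two arguments.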

Common failures

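  • Temperature 0 still produces varied output: Residual nondeterminism can come from the runtime (for example GPU kernel scheduling or batching) and, in mixture-of-experts variants, from expert routing. Set a fixed seed (for example seed: 42 in options) alongside temperature 0 to make runs as reproducible as the backend allows.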
  • Excessive repetition at high temperature: Increase repeat_penalty to 1.2-1.5 when using temperature > 0.8.
  • Reasoning truncation: If the token limit (num_predict in Ollama options) cuts off generation mid-reasoning, the final answer will be missing. Always reserve enough budget for the chain of thought plus the answer.
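The failures above can be guarded against in code. The sketch below makes two assumptions: a non-streaming response reports `done_reason` as `"length"` when the token cap was hit, and the reasoning is wrapped in `<think>` tags. `build_options` and `looks_truncated` are hypothetical helpers, not part of any library.

```python
def build_options(temperature, seed=None, repeat_penalty=None, num_predict=None):
    """Assemble an options dict, leaving out anything that was not set."""
    options = {"temperature": temperature}
    if seed is not None:
        options["seed"] = seed
    if repeat_penalty is not None:
        options["repeat_penalty"] = repeat_penalty
    if num_predict is not None:
        options["num_predict"] = num_predict
    return options

def looks_truncated(result):
    """Heuristic check on a parsed, non-streaming response dict."""
    text = result.get("response", "")
    # Assumption: the runtime sets done_reason to "length" at the token cap.
    hit_limit = result.get("done_reason") == "length"
    # Assumption: reasoning is wrapped in <think>...</think>.
    unclosed_think = "<think>" in text and "</think>" not in text
    return hit_limit or unclosed_think

# Made-up response dicts for illustration.
cut_off = {"response": "<think>Let me consider", "done_reason": "length"}
complete = {"response": "<think>2x = 10</think>\nx = 5", "done_reason": "stop"}
print(build_options(0.0, seed=42))
print(looks_truncated(cut_off), looks_truncated(complete))
```

When `looks_truncated` returns true, one option is to retry with a larger `num_predict` instead of parsing an answer that may not be there.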

Operator checkpoint

Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.

Related guides
