What this does

Sampling parameters influence how long and how exploratory DeepSeek-R1's reasoning chains are. Higher temperatures tend to produce more exploratory chains; lower temperatures tend to yield more direct solutions. This guide covers tuning those parameters through Ollama's /api/generate endpoint for different problem types.

Steps

Use temperature to control reasoning breadth. Lower values (0.0-0.3) produce focused reasoning; higher values (0.7-1.0) explore alternative paths.

curl -s http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:32b", "prompt": "Prove the Pythagorean theorem", "stream": false, "options": {"temperature": 0.1}}' \
  | jq -r '.response' > focused.txt
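To see why a low temperature gives focused output, it helps to look at how temperature rescales next-token probabilities. The following is a minimal sketch of temperature-scaled softmax, not the runtime's actual sampler; the function name and the three logit values are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as greedy (argmax) decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
for t in (0.1, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.1 almost all probability mass lands on the top-scoring token, while at 1.0 the lower-ranked tokens keep a meaningful share, which is what lets the model wander onto alternative paths.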

Set num_predict to limit generation length. Ollama's options object uses num_predict for this cap rather than max_tokens. Short reasoning tasks (basic math) need fewer tokens; open-ended problems benefit from more.

curl -s http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:32b", "prompt": "Is P=NP? Explain", "stream": false, "options": {"num_predict": 2048}}' \
  | jq -r '.response'
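One way to keep token budgets consistent across problem types is a small lookup that feeds the request body. This is a sketch only: the task names and budget values below are hypothetical placeholders, and it assumes the generation cap is passed as num_predict inside options, as Ollama's API expects.

```python
import json

# Hypothetical token budgets per task type; tune these for your own prompts.
BUDGETS = {"arithmetic": 512, "proof": 2048, "open_ended": 4096}

def build_payload(prompt, task_type, model="deepseek-r1:32b"):
    """Build an /api/generate request body with a task-specific num_predict."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_predict": BUDGETS[task_type]},
    }

print(json.dumps(build_payload("Solve 2x+5=15", "arithmetic"), indent=2))
```

The printed JSON can be passed to curl with -d or to requests.post as the json argument.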

Use top_p to narrow token selection. A top_p of 0.7 restricts sampling to the smallest set of tokens whose cumulative probability reaches 0.7, which tends to shorten reasoning chains by cutting off low-probability detours.

curl -s http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:32b", "prompt": "Design a sorting algorithm", "stream": false, "options": {"temperature": 0.4, "top_p": 0.7}}' \
  | jq -r '.response'
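The effect of top_p is easier to see on a toy distribution. The sketch below implements a simple version of nucleus filtering; the token strings and probabilities are hypothetical, and real samplers may differ in details such as how the token that crosses the threshold is handled.

```python
def nucleus(probs, top_p):
    """Return the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# Hypothetical next-token distribution.
probs = {"therefore": 0.45, "so": 0.25, "alternatively": 0.15, "wait": 0.10, "hmm": 0.05}
print(nucleus(probs, 0.5))  # keeps 2 tokens
print(nucleus(probs, 0.9))  # keeps 4 tokens
```

With the lower threshold, tokens such as "alternatively" and "wait" never become candidates, so the model has fewer chances to branch into a new line of reasoning.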

Compare reasoning chain lengths across settings.

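import requests

# Sweep temperature and top_p; report output length for each combination.
settings = [{"temperature": t, "top_p": p} for t in [0.0, 0.5, 1.0] for p in [0.5, 0.9, 1.0]]
for opts in settings:
    r = requests.post("http://localhost:11434/api/generate",
        json={"model": "deepseek-r1:32b", "prompt": "Solve 2x+5=15", "options": opts, "stream": False},
        timeout=600)
    r.raise_for_status()
    body = r.json()
    # eval_count is the number of generated tokens reported by Ollama.
    print(opts, "chars:", len(body["response"]), "tokens:", body.get("eval_count"))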
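Total response length mixes the reasoning chain with the final answer. To measure the reasoning alone, the two can be separated, as in the sketch below. It assumes the reasoning arrives inline between <think> and </think> tags in the response text; some runtime versions may return it in a separate field instead, in which case this split is unnecessary. The helper name is hypothetical.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Split a response into (reasoning, answer) using <think> tags."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()  # no tagged reasoning found
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>2x = 10, so x = 5.</think>\nx = 5"
reasoning, answer = split_reasoning(sample)
print(len(reasoning.split()), "reasoning words;", "answer:", answer)
```

Applying split_reasoning to each response in the sweep gives a reasoning-only word count per setting.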

Verification

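# Generate the high-temperature counterpart of focused.txt, then compare word counts
curl -s http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:32b", "prompt": "Prove the Pythagorean theorem", "stream": false, "options": {"temperature": 0.9}}' \
  | jq -r '.response' > exploratory.txt
wc -w focused.txt exploratory.txt
# Expected: exploratory.txt noticeably longer than focused.txt (roughly 2-5x; varies by prompt and run)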

Common failures

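Temperature 0 still produces varied output: Set "seed": 42 in options alongside temperature 0 so sampling is reproducible. Small differences can still appear across hardware, backends, or runtime versions because of floating-point and batching effects. Note that deepseek-r1:32b is a dense distilled model, so expert routing is not a source of randomness there.
Excessive repetition at high temperature: Increase repeat_penalty to 1.2-1.5 in options when using temperature > 0.8.
Reasoning truncation: If num_predict cuts off generation mid-reasoning, the final answer will be missing. Always reserve enough budget for the chain of thought plus the answer.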
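Truncated reasoning can be flagged automatically instead of spotted by eye. The sketch below is a heuristic under two assumptions: that the parsed /api/generate response carries a done_reason field set to "length" when the token cap was hit, and that reasoning is wrapped in <think> tags inside the response text. The function name is hypothetical.

```python
def looks_truncated(result):
    """Heuristic check on a parsed /api/generate response dict."""
    text = result.get("response", "")
    # Assumption: the runtime reports "length" when the token cap stopped generation.
    hit_limit = result.get("done_reason") == "length"
    # An opened but never closed <think> block means the answer was never reached.
    unclosed_think = "<think>" in text and "</think>" not in text
    return hit_limit or unclosed_think

print(looks_truncated({"response": "<think>Let me start by", "done_reason": "length"}))
print(looks_truncated({"response": "<think>ok</think>x = 5", "done_reason": "stop"}))
```

When the check fires, rerun the prompt with a larger token budget rather than trusting the partial output.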

Operator checkpoint

Before treating this as solved, write down the local runtime, model or package version, hardware/backend if relevant, and the verification output. This keeps the guide useful as a Will-It-Run style decision instead of a one-off command transcript.
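The checkpoint can be captured as a small JSON record so it is easy to store next to the outputs. This is one possible layout, not a required format; the field names are hypothetical, and the runtime version and verification output are filled in by hand (for example from ollama --version and the wc -w output).

```python
import json
import platform
from datetime import datetime, timezone

def checkpoint_record(runtime_version, model, options, verification_output):
    """Collect the checkpoint facts into one JSON-serializable dict."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "runtime_version": runtime_version,  # e.g. output of `ollama --version`
        "model": model,
        "options": options,
        "os": platform.platform(),
        "machine": platform.machine(),
        "verification_output": verification_output,
    }

record = checkpoint_record("<fill in>", "deepseek-r1:32b", {"temperature": 0.1}, "<paste wc -w output>")
print(json.dumps(record, indent=2))
```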
