Paste anything — a prompt, source file, transcript, email draft. We count tokens, then show you the cost across 11 cloud providers and your local rig, side by side. Same data for the same input. Decide where each workload belongs.
11 providers across 3 tiers. Prices verified May 2026. Token counts are content-aware approximations (±15%) — provider price differences dwarf tokenizer accuracy, so this is enough to make a decision. The URL updates as you change settings, so you can share a comparison by copying the link.
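A ±15% estimate doesn't need a real tokenizer; a characters-per-token heuristic that adjusts for content type gets close enough. The ratios below (4.0 chars/token for prose, 3.5 for code-heavy text) are illustrative assumptions, not this tool's exact weights:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 chars/token for prose, slightly fewer for code.

    The 4.0 / 3.5 ratios and the code-density threshold are assumptions
    for illustration, not the calculator's actual tokenizer model.
    """
    # Code tends to contain these characters far more often than prose.
    code_markers = sum(text.count(c) for c in "{}();=<>")
    chars_per_token = 3.5 if code_markers > len(text) / 40 else 4.0
    return max(1, round(len(text) / chars_per_token))

print(estimate_tokens("Paste anything and count the tokens before you spend."))
```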
Start with the sample text or paste your own. Pick a local rig + model from your catalog and the cloud providers you want to compare against. Everything is reactive.
| Provider | Model | Tier | $/M in | $/M out | Per run | Per month | vs cheapest |
|---|---|---|---|---|---|---|---|
| DeepSeek | DeepSeek V3 (API) | Mid-tier | $0.27 | $1.10 | $0.00012 | $0.177 | cheapest |
| Together AI | Llama 3.3 70B (Together) | Open-weight | $0.88 | $0.88 | $0.00019 | $0.284 | 1.6× |
| Anthropic | Claude Sonnet 4.5 | Frontier | $3.00 | $15.00 | $0.00151 | $2.26 | 12.8× |
| OpenAI | GPT-5 | Frontier | $5.00 | $20.00 | $0.00216 | $3.23 | 18.3× |
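The per-run and per-month figures follow directly from the listed rates. A sketch of the arithmetic, where the token counts and run volume are placeholder assumptions, not the sample workload behind the table:

```python
def per_run_cost(in_tokens: int, out_tokens: int,
                 usd_per_m_in: float, usd_per_m_out: float) -> float:
    """Cost of one run at the listed per-million-token rates."""
    return (in_tokens * usd_per_m_in + out_tokens * usd_per_m_out) / 1_000_000

def monthly_cost(per_run: float, runs_per_day: float, days: int = 30) -> float:
    """Naive monthly projection at a steady daily run volume."""
    return per_run * runs_per_day * days

# Illustrative workload: 180 input / 60 output tokens, 50 runs a day,
# priced at Claude Sonnet 4.5's $3 in / $15 out rates from the table.
run = per_run_cost(180, 60, usd_per_m_in=3.00, usd_per_m_out=15.00)
print(f"${run:.5f} per run, ${monthly_cost(run, runs_per_day=50):.2f}/month")
```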
Once you know it's worth going local, pick the whole rig + runtime + model + install script in one wizard.
The local cost above assumes Q4_K_M. Drill into which quant fits your specific hardware × model × context.
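Whether a quant fits is mostly a memory-budget check: weight bytes at the quant's bits per weight, plus KV cache for the context you want. A rough sketch, where the bits-per-weight figure, KV dimensions, and overhead factor are approximations, not this tool's exact model:

```python
def fits_in_vram(params_b: float, bits_per_weight: float, ctx: int,
                 n_layers: int, kv_dim: int, vram_gb: float) -> bool:
    """Rough check: quantized weights + fp16 KV cache vs available VRAM.

    ~4.85 bits/weight for Q4_K_M is an approximation; KV cache is
    2 tensors (K and V) * 2 bytes (fp16) * layers * kv_dim * context.
    """
    weight_gb = params_b * bits_per_weight / 8  # params in billions -> GB
    kv_gb = 2 * 2 * n_layers * kv_dim * ctx / 1e9
    return weight_gb + kv_gb * 1.1 <= vram_gb  # ~10% runtime overhead on KV

# Llama-70B-class at Q4_K_M with 8k context on a 48 GB card (illustrative).
print(fits_in_vram(params_b=70, bits_per_weight=4.85, ctx=8192,
                   n_layers=80, kv_dim=1024, vram_gb=48))
```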
The break-even number above is volume-driven. Switch view to tune utilization / electricity / amortization assumptions.
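Break-even is the monthly run volume at which cloud spend crosses the rig's amortized hardware cost plus electricity. A sketch of that relationship, with every input an assumption of the kind you would tune in the view:

```python
def break_even_runs_per_month(cloud_per_run: float, rig_price: float,
                              amort_months: int, watts: float,
                              usd_per_kwh: float, secs_per_run: float) -> float:
    """Monthly run count at which cloud cost equals local cost."""
    hardware_monthly = rig_price / amort_months
    electricity_per_run = watts / 1000 * secs_per_run / 3600 * usd_per_kwh
    if cloud_per_run <= electricity_per_run:
        return float("inf")  # cloud is cheaper per run; local never breaks even
    return hardware_monthly / (cloud_per_run - electricity_per_run)

# Illustrative: $2,500 rig amortized over 24 months, 350 W under load,
# $0.15/kWh, 20 s per run, vs the $0.00151/run cloud price from the table.
print(round(break_even_runs_per_month(0.00151, 2500, 24, 350, 0.15, 20)))
```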
Cost is one side of the trade — speed is the other. Watch tokens stream at the rate your rig would produce them, side-by-side with the cloud provider.
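The side-by-side stream is, at its core, output length divided by each side's decode rate, plus time to first token. A minimal sketch, with both tokens-per-second figures and latencies as assumed placeholders:

```python
def stream_seconds(out_tokens: int, tokens_per_sec: float,
                   first_token_latency: float = 0.0) -> float:
    """Wall-clock time to stream a completion at a steady decode rate."""
    return first_token_latency + out_tokens / tokens_per_sec

# Assumed rates: 14 tok/s on the rig, 70 tok/s + 0.8 s first-token latency
# in the cloud, for a 500-token completion.
local = stream_seconds(500, tokens_per_sec=14.0)
cloud = stream_seconds(500, tokens_per_sec=70.0, first_token_latency=0.8)
print(f"local {local:.1f}s vs cloud {cloud:.1f}s")
```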
Step-by-step guide to swap Claude Sonnet for a local model in your Claude Code workflow. Real commands.