05. The Economics
Cloud AI Costs: The Unseen Ledger
Cloud AI services aren't free to run—they're just free to you (up to a point). Understanding the real costs explains why pricing changes and helps you evaluate the local vs. cloud tradeoff.
OpenAI API pricing (approximate, per published price lists; this is what API users are charged, not OpenAI's internal cost):
- GPT-4o: ~$2.50/1M tokens input, ~$10/1M tokens output
- GPT-4o-mini: ~$0.15/1M tokens input, ~$0.60/1M tokens output
Anthropic API pricing (approximate):
- Claude 3.5 Sonnet: ~$3/1M tokens input, ~$15/1M tokens output
What does this mean in practice?
A single response late in a long conversation might use 50,000 input tokens (your conversation history) plus 5,000 output tokens. For GPT-4o:
- Input: 50,000 / 1,000,000 × $2.50 = $0.125
- Output: 5,000 / 1,000,000 × $10 = $0.05
- Total: ~$0.175 per response
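The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not an official pricing tool: the `PRICES` table simply copies the approximate per-million-token figures listed above, and the model keys and the `response_cost` helper are names made up for this example.

```python
# Approximate USD prices per 1M tokens (input, output), copied from the lists above.
# The dictionary keys are illustrative labels, not guaranteed API model identifiers.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3.5-sonnet": (3.00, 15.00),
}

def response_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one response at the listed per-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Same example sizes as above: 50,000 input tokens and 5,000 output tokens.
for model in PRICES:
    print(f"{model}: ${response_cost(model, 50_000, 5_000):.4f}")
```

At these example sizes, the smaller model comes out at roughly a cent per response, which is why model choice matters as much as volume.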
Heavy use adds up. At 20 responses of that size a day, 7 days a week, that is 20 × $0.175 = $3.50/day, or roughly $105/month in token costs alone (ignoring subscription fees).
Local AI Costs: The One-Time Purchase
Local AI has a different cost structure: upfront hardware investment, then essentially free use.
Hardware examples (current US pricing):
| Configuration | Hardware | Cost |
|---|---|---|
| CPU-only | 32GB RAM, modern CPU | $400-600 |
| Budget GPU | RTX 3060 12GB | $300-400 + system |
| Mid-range GPU | RTX 4070 12GB | $600-800 + system |
| High-end GPU | RTX 4090 24GB | $1,600-1,800 + system |
Once you have the hardware, running local AI costs only electricity—roughly $0.10-0.20/day with heavy use (depending on GPU power draw and your electricity rate).
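To see where a daily figure like that comes from, here is a rough sketch of the electricity math. The inputs (300 W under load, 3 hours of active use, $0.15 per kWh) are hypothetical placeholders, not measurements; swap in your own GPU's power draw and your utility rate.

```python
def daily_electricity_cost(watts: float, hours_per_day: float, usd_per_kwh: float) -> float:
    """Energy in kWh is watts / 1000 * hours; cost is energy times the rate."""
    return watts / 1000 * hours_per_day * usd_per_kwh

# Hypothetical inputs: 300 W under load, 3 hours/day, $0.15 per kWh.
print(f"${daily_electricity_cost(300, 3, 0.15):.3f} per day")
```

With these assumed inputs the result lands inside the $0.10-0.20 range quoted above. Idle power draw is ignored here, so a machine left on all day will cost somewhat more.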
Break-even analysis:
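If you're currently paying $20/month for cloud AI, a $400 hardware investment breaks even in 20 months ($400 ÷ $20). If you're paying $100/month, it breaks even in 4 months. Both figures ignore electricity, which at the daily rate above adds about $3-6/month and pushes the break-even point out slightly.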
The crossover point depends on your usage. Heavy users break even quickly. Light users may never recover the hardware cost—but they gain privacy and offline capability.
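The break-even logic can be written as a small function. This is a sketch under simple assumptions: cloud spend is constant month to month, hardware has no resale value, and `local_monthly` (a name chosen for this example) stands for whatever you estimate for electricity.

```python
import math

def break_even_months(hardware_cost: float, cloud_monthly: float,
                      local_monthly: float = 0.0) -> float:
    """Months until cloud spending avoided equals the hardware cost."""
    monthly_savings = cloud_monthly - local_monthly
    if monthly_savings <= 0:
        # Running costs meet or exceed cloud spend: local never pays for itself.
        return math.inf
    return hardware_cost / monthly_savings

print(break_even_months(400, 20))        # 20.0
print(break_even_months(400, 100))       # 4.0
print(break_even_months(400, 20, 4.50))  # with ~$0.15/day of electricity, about 25.8
```

The `math.inf` case is the "light users may never recover the hardware cost" scenario: if you spend less on cloud than local costs to run, there is no crossover.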
The Hidden Costs
Local AI has costs that aren't obvious:
Time cost: Setup and configuration take 2-4 hours for first-timers. Ongoing maintenance is minimal, but the initial investment is real.
Opportunity cost: You could spend that time using cloud AI. For some people, that's the right choice.
Hardware ceiling: Some tasks require capabilities your hardware can't provide. You might need to use cloud for certain things anyway.
Making the Decision
Use cloud when:
- You need the absolute best model quality for a one-off task
- You don't have the hardware and don't want to buy it
- The task is non-sensitive and you don't care about privacy
Use local when:
- Privacy matters (documents, conversations, data you don't want leaving your control)
- You use AI heavily (daily, many conversations)
- You want offline capability
- You want to customize behavior, system prompts, or parameters
Both: Many users do both. Light casual use goes to cloud. Sensitive tasks go local.
Calculate your current cloud AI usage: how many conversations do you have per day, and roughly how long are they? Estimate the tokens per response and multiply by the per-token price (at the example sizes used earlier, roughly $0.01 per response on GPT-4o-mini and $0.18 on GPT-4o). How much are you spending per month? Now compare that to a $400 GPU that lasts 3 years, which works out to about $11/month before electricity: what does your monthly cloud cost need to be for local to save money?
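If you'd rather not do this by hand, the exercise can be sketched as below. The usage numbers (15 responses a day at $0.05 each) are hypothetical placeholders to replace with your own estimates, and both helper names are made up for this example.

```python
def monthly_cloud_cost(responses_per_day: float, cost_per_response: float,
                       days: int = 30) -> float:
    """Estimated monthly token spend for a steady daily usage pattern."""
    return responses_per_day * cost_per_response * days

def amortized_monthly(hardware_cost: float, lifetime_years: float) -> float:
    """Hardware cost spread evenly over its expected lifetime, in USD per month."""
    return hardware_cost / (lifetime_years * 12)

# Hypothetical usage: 15 responses per day at $0.05 each.
spend = monthly_cloud_cost(15, 0.05)
threshold = amortized_monthly(400, 3)  # the $400 GPU over 3 years

print(f"Estimated cloud spend: ${spend:.2f}/month")
print(f"Hardware threshold:    ${threshold:.2f}/month (before electricity)")
print("Local saves money" if spend > threshold else "Cloud is cheaper")
```

Add your own electricity estimate to the threshold before drawing a conclusion, since amortized hardware cost alone understates what local use costs.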