DeepSeek V4 ships open weights — frontier reasoning at MoE serving cost
▼ WHAT HAPPENED
DeepSeek released the V4 model weights under its permissive, commercial-friendly license. The model is a Mixture-of-Experts at trillion-parameter total scale, with roughly 37-50B active parameters per token. Capability lands within 20 Elo points of GPT-5 mini on reasoning benchmarks (AIME, GPQA, math-contest data). It's the same architectural family as V3, with a ~30% improvement on hard reasoning and better multilingual coverage.
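The "trillion total, ~40B active" framing is just MoE routing arithmetic. A minimal sketch of it below; the shared size, expert size, expert count, and top-k are hypothetical placeholders, not published V4 specs:

```python
# MoE active-parameter arithmetic. All configuration numbers are
# placeholders chosen to land in the ballpark the release notes describe,
# not DeepSeek's published V4 config.

def active_params_b(shared_b: float, expert_b: float, top_k: int) -> float:
    """Parameters touched per token: always-on shared weights plus the
    top-k routed experts the router actually activates."""
    return shared_b + top_k * expert_b

SHARED_B, EXPERT_B, N_EXPERTS, TOP_K = 10, 2, 512, 16  # hypothetical
total_b = SHARED_B + N_EXPERTS * EXPERT_B
print(f"total ≈ {total_b}B, active ≈ {active_params_b(SHARED_B, EXPERT_B, TOP_K)}B per token")
# -> total ≈ 1034B, active ≈ 42B. Per-token FLOPs (and serving cost) track
# the active figure, not the total -- that's the MoE serving-cost story.
```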
▼ OPERATOR ANGLE
**For self-hosting**: plan on two 8× [H100 SXM](/hardware/nvidia-h100-sxm) nodes as the floor for FP8 production serving (a ~1T-parameter checkpoint is ~1 TB of weights before KV cache); a single 8× [MI300X](/hardware/amd-mi300x) node (1.5 TB aggregate) also fits it, and Q4 quantization squeezes it onto one H100 node. Frontier-tier hardware required either way; the back-of-envelope below shows the math.
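A minimal sketch of that weight-memory arithmetic, assuming ~1T total parameters (a placeholder until you check the released checkpoint) and ignoring the extra headroom real serving needs for KV cache and activations:

```python
# Weight-memory check behind the node counts above. Placeholder inputs;
# real deployments also need 20-40% headroom for KV cache and activations.
import math

GB = 1e9

def weight_gb(total_params: float, bits_per_weight: float) -> float:
    """GB of resident weights at a given quantization width."""
    return total_params * bits_per_weight / 8 / GB

total_params = 1.0e12  # placeholder: "trillion-parameter total scale"
node_vram_gb = {"8x H100 SXM": 8 * 80, "8x MI300X": 8 * 192}

for label, bits in [("FP8", 8), ("Q4", 4)]:
    need = weight_gb(total_params, bits)
    fits = ", ".join(f"{math.ceil(need / v)}x {n}" for n, v in node_vram_gb.items())
    print(f"{label}: {need:,.0f} GB weights -> {fits}")
# FP8: 1,000 GB -> 2x 8x H100 SXM, 1x 8x MI300X
# Q4:    500 GB -> 1x 8x H100 SXM (tight: ~140 GB left for KV cache)
```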
**For cloud rental**: $30-60/hr per node on Runpod / Lambda for an 8× H100 cluster. At sustained workloads, compare against the Claude 3.7 Sonnet API; self-hosting wins above roughly 150 QPS, and the sketch below lets you rerun that math with your own numbers.
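The QPS threshold is sensitive to node pricing, request shape, and which API tier you compare against, and it bakes in ops/engineering overhead that raw compute math doesn't capture. A sketch with placeholder prices, none of them quoted rates:

```python
# Effective $/Mtok of a rented cluster at sustained load. All inputs are
# placeholders: plug in your actual node rate, request shape, and the
# blended $/Mtok of the API tier you'd otherwise use. Ops overhead (the
# main reason the practical break-even sits well above raw cost parity)
# is not modeled here.

def selfhost_usd_per_mtok(cluster_usd_per_hr: float, qps: float,
                          tokens_per_req: float) -> float:
    """Cluster cost spread over the tokens it serves per hour."""
    tokens_per_hr = qps * 3600 * tokens_per_req
    return cluster_usd_per_hr / tokens_per_hr * 1e6

# Example: two 8x H100 nodes at $45/hr each, ~2,000 tokens per request.
for qps in (10, 50, 150):
    cost = selfhost_usd_per_mtok(2 * 45, qps, 2000)
    print(f"{qps:>4} QPS -> ${cost:.2f}/Mtok self-host")
# Compare each line against your API tier's blended $/Mtok at the same
# request shape; the crossover is your break-even.
```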
**For evaluation**: don't deploy without running your specific workload through both V3 and V4 (a minimal harness sketch follows). The reasoning-quality jump matters most for math, code, and multi-step planning; for general chat, V3 Lite at lower serving cost may still be the right pick.
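A minimal A/B sketch, assuming both models sit behind OpenAI-compatible endpoints (e.g. vLLM); the base URLs, model names, and prompt list are placeholders for your own setup, and scoring is left to whatever metric your workload actually cares about:

```python
# Replay the same prompts against V3 and V4 and diff the outputs.
# Endpoints and model names below are placeholders, not real hosts.
from openai import OpenAI

ENDPOINTS = {
    "deepseek-v3": OpenAI(base_url="http://v3-host:8000/v1", api_key="unused"),
    "deepseek-v4": OpenAI(base_url="http://v4-host:8000/v1", api_key="unused"),
}

def run(prompt: str) -> dict[str, str]:
    """Same prompt, same sampling settings, both models."""
    out = {}
    for name, client in ENDPOINTS.items():
        resp = client.chat.completions.create(
            model=name,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.0,   # deterministic-ish, for comparable outputs
            max_tokens=1024,
        )
        out[name] = resp.choices[0].message.content
    return out

for prompt in ["<your real workload prompts here>"]:
    for name, answer in run(prompt).items():
        print(f"--- {name}\n{answer}\n")
```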
See [DeepSeek V4 verdict](/models/deepseek-v4) for full deployment math, [DeepSeek family hub](/families/deepseek) for context across the lineage.