Nemotron 3 Ultra (550B-A55B)

NVIDIA Nemotron 3 Ultra (550B-A55B) is a frontier-scale open-weight reasoning model (Hugging Face `nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16`, 2026-06) with 550B total and 55B active parameters. Its "LatentMoE" architecture interleaves Mamba-2, MoE, and select attention layers with multi-token prediction, and it was trained on an NVFP4 recipe. It supports up to 1M tokens of context and 10 languages. The model is text-only and released under the Linux Foundation OpenMDW License v1.1, which permits commercial use. Vendor-reported benchmarks (via NVIDIA's Nemo Evaluator SDK) include SWE-bench Verified 70.7, GPQA (no tools) 87.0, and RULER@1M 94.7; none of these have been independently verified. The smaller Nemotron 3 Nano and Super tiers are far more practical to run on local hardware.

License: OpenMDW-1.1 · Released Jun 4, 2026 · Context: 1,000,000 tokens

Positioning

Nemotron 3 Ultra (550B-A55B) is NVIDIA's frontier-scale open-weight reasoning model (June 2026) — and notably the most capable open model from a US lab, released fully open (weights + recipes) under the Linux Foundation's OpenMDW-1.1 commercial license.

What stands out

The architecture is the interesting part: a "LatentMoE" hybrid interleaving Mamba-2, MoE, and select attention layers with multi-token prediction, trained on an NVFP4 recipe. The NVFP4 checkpoints make a 550B model unusually tractable on Blackwell hardware, and 1M context is supported (NVFP4-on-Blackwell; 262K in BF16). A fully-open US-lab model with permissive licensing is rare and valuable for teams that want provenance.
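As a rough back-of-the-envelope check on why NVFP4 matters at this scale, the sketch below estimates the weight-only memory for 550B parameters. It assumes BF16 costs 16 bits per weight and NVFP4 averages about 4.5 bits (4-bit values plus one 8-bit scale per 16-weight block). It ignores the KV cache, activations, and any layers kept in higher precision, so real footprints will differ. The function name and constants are illustrative and do not come from NVIDIA's documentation.

```python
TOTAL_PARAMS = 550e9   # total parameters, from the model name (550B)
ACTIVE_PARAMS = 55e9   # active parameters per token (A55B)

# Assumed average storage cost per weight, in bits.
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "nvfp4": 4.5,  # assumption: 4-bit values + one 8-bit scale per 16 weights
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Weight-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    bits = BITS_PER_WEIGHT[precision]
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    for precision in BITS_PER_WEIGHT:
        gb = weight_memory_gb(TOTAL_PARAMS, precision)
        print(f"{precision}: ~{gb:,.0f} GB of weights")
    print(f"active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.0%}")
```

Under these assumptions the weights alone come to roughly 1,100 GB in BF16 versus about 309 GB in NVFP4, so the model needs multi-GPU hardware either way.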

Honest caveats

All benchmarks are NVIDIA-reported (via their Nemo Evaluator SDK) and not independently verified. Per Artificial Analysis it still trails the leading Chinese open models on the intelligence index. At 550B it is datacenter-class; the smaller Nemotron 3 Super (120B) and Nemotron 3 Nano (30B) tiers are the locally-practical choices for most readers.

Verdict

Run it if you need a fully-open, commercially-licensed frontier reasoner with documented training recipes and you have Blackwell / NVFP4 infrastructure. Most local users should start with Nemotron 3 Super or Nano instead — same family, same openness, hardware you can actually buy.

Frequently asked

Can I use Nemotron 3 Ultra (550B-A55B) commercially?

Yes. Nemotron 3 Ultra (550B-A55B) ships under the OpenMDW-1.1 license, which permits commercial use. Always read the license text before deployment.

What's the context length of Nemotron 3 Ultra (550B-A55B)?

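Nemotron 3 Ultra (550B-A55B) supports a context window of up to 1,000,000 tokens (1M). As noted under "What stands out", the full 1M window applies to NVFP4 on Blackwell hardware; in BF16 the limit is 262K.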
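For planning long-context workloads, a minimal sketch of a pre-flight check is shown below. It uses the context figures quoted on this page (1M for NVFP4 on Blackwell, 262K for BF16). Reading 262K as 262,144 tokens is an assumption, and the dictionary and function names are hypothetical rather than part of any NVIDIA or Hugging Face API.

```python
# Context limits as described on this page. The exact BF16 figure is an
# assumption: "262K" is read here as 262,144 tokens.
CONTEXT_LIMITS = {
    "nvfp4": 1_000_000,
    "bf16": 262_144,
}


def fits_in_context(prompt_tokens: int, max_new_tokens: int, checkpoint: str) -> bool:
    """Return True if the prompt plus the generation budget fits the window."""
    return prompt_tokens + max_new_tokens <= CONTEXT_LIMITS[checkpoint]


if __name__ == "__main__":
    # A 300K-token prompt with a 4,096-token generation budget.
    for checkpoint in CONTEXT_LIMITS:
        ok = fits_in_context(300_000, 4_096, checkpoint)
        print(f"{checkpoint}: {'fits' if ok else 'too long'}")
```

Token counts depend on the model's tokenizer, so measure them with the actual tokenizer rather than estimating from character length.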
