llama
70B parameters
Commercial OK

Llama 3.1 Nemotron 70B Instruct

NVIDIA's HelpSteer2-tuned Llama 3.1 70B. Topped Arena Hard at release. The pre-Nemotron-3 NVIDIA reference open weights.

License: Llama 3.1 Community License · Released Oct 15, 2024 · Context: 131,072 tokens

Overview

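Llama 3.1 Nemotron 70B Instruct is NVIDIA's fine-tune of Llama 3.1 70B, tuned with the HelpSteer2 dataset. It topped the Arena Hard benchmark at release and served as NVIDIA's reference open-weights model before Nemotron 3.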

Strengths

  • Top-ranked instruction following at release (led Arena Hard)
  • Tuned with NVIDIA's HelpSteer2 dataset

Weaknesses

  • Now historical: predates NVIDIA's Nemotron 3 models
  • Needs 48 GB or more of VRAM, even at Q4_K_M

Quantization variants

Each quantization trades some model quality for a smaller file and lower VRAM use. Q4_K_M is the most popular starting point.

Quantization    File size    VRAM required
Q4_K_M          40.0 GB      48 GB
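
As a rough cross-check of the figures above, file size scales with parameter count times the average number of bits stored per weight. The sketch below assumes about 4.57 bits per weight for Q4_K_M, a value back-solved from the 40.0 GB listed here rather than an official number, and it assumes decimal gigabytes. The function names are hypothetical.

```python
def estimate_file_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate model file size in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


def weights_fit(file_size_gb: float, vram_gb: float) -> bool:
    """True if the weights alone fit in VRAM; ignores KV cache and runtime overhead."""
    return file_size_gb <= vram_gb


# Assumption: ~4.57 bits/weight, back-solved from the 40.0 GB figure on this page.
size_q4 = estimate_file_size_gb(70, 4.57)
# Unquantized 16-bit weights, for comparison.
size_fp16 = estimate_file_size_gb(70, 16)

print(f"Q4_K_M (assumed 4.57 bits/weight): {size_q4:.1f} GB")
print(f"16-bit weights: {size_fp16:.1f} GB")
print("Q4_K_M weights fit in 48 GB:", weights_fit(size_q4, 48))
```

The gap between the estimated file size and the 48 GB requirement is headroom for the KV cache and runtime buffers, which this sketch does not model.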

Get the model

Ollama

One-line install

ollama run nemotron:70b

HuggingFace

Original weights

huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct

This is the source repository with the original, unquantized weights. To get a smaller file, you must quantize them yourself.

Frequently asked

What's the minimum VRAM to run Llama 3.1 Nemotron 70B Instruct?

48 GB of VRAM is enough to run Llama 3.1 Nemotron 70B Instruct at the Q4_K_M quantization (file size 40.0 GB). Higher-quality quantizations need more.

Can I use Llama 3.1 Nemotron 70B Instruct commercially?

Yes. Llama 3.1 Nemotron 70B Instruct ships under the Llama 3.1 Community License, which permits commercial use subject to its terms. Always read the license text before deployment.

What's the context length of Llama 3.1 Nemotron 70B Instruct?

Llama 3.1 Nemotron 70B Instruct supports a context window of 131,072 tokens (128 × 1,024, commonly written as 128K).
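
Using the full window costs memory beyond the weights, because the runtime keeps a key/value cache entry for every token. A minimal sketch of that arithmetic follows, assuming the standard Llama 3.1 70B architecture (80 layers, 8 key/value heads, head dimension 128) and a 16-bit cache. None of these values are stated on this page, and real runtimes may quantize or cap the cache.

```python
def kv_cache_bytes(
    tokens: int,
    n_layers: int = 80,        # assumed: Llama 3.1 70B layer count
    n_kv_heads: int = 8,       # assumed: grouped-query attention KV heads
    head_dim: int = 128,       # assumed: per-head dimension
    bytes_per_value: int = 2,  # assumed: 16-bit cache entries
) -> int:
    """Bytes needed to cache keys and values for `tokens` tokens."""
    # Factor of 2 covers one key vector and one value vector per head per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * tokens


CONTEXT_TOKENS = 131_072

per_token_kib = kv_cache_bytes(1) / 1024
full_window_gib = kv_cache_bytes(CONTEXT_TOKENS) / 2**30

print(f"KV cache per token: {per_token_kib:.0f} KiB")
print(f"KV cache at {CONTEXT_TOKENS:,} tokens: {full_window_gib:.0f} GiB")
```

Under these assumptions the cache takes 320 KiB per token and 40 GiB at the full window, so on a 48 GB card the usable context is likely much shorter than the model's maximum.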

How do I install Llama 3.1 Nemotron 70B Instruct with Ollama?

Run `ollama pull nemotron:70b` to download, then `ollama run nemotron:70b` to start a chat session. The default quantization is Q4_K_M.
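
Once the model is pulled, it can also be called from code through Ollama's local HTTP API. The sketch below uses only the Python standard library and assumes an Ollama server running on its default port (11434) with `nemotron:70b` already downloaded. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "nemotron:70b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "nemotron:70b", timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]


# Example call (requires a running Ollama server with the model pulled):
# print(generate("Summarize the Llama 3.1 Community License in one sentence."))
```

Setting `stream` to false returns a single JSON object instead of a stream of chunks, which keeps the example short. The long timeout is a guess to allow for slow first-token latency on a 70B model.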

Source: huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.