Can the Jetson Orin Nano Super run local LLMs usefully?

Reviewed May 15, 2026 · 2 min read
jetson · edge-ai · small-models · robotics · embedded

The answer

One paragraph. No hedging beyond what the data actually warrants.

Yes — for 3B-class models in edge / robotics contexts. No for anything else.

The Jetson Orin Nano Super (the 2024 firmware update on the Orin Nano) gives you:

  • 8GB unified memory (the binding constraint)
  • ~60-100 GB/s memory bandwidth (4-5× lower than a desktop GPU)
  • ARM Linux (some software needs ARM-built wheels)
  • ~$249 MSRP (street price is often higher when stock is tight)
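The bandwidth bullet is the one that sets chat speed. Autoregressive decode streams roughly the whole weight file per token, so bandwidth divided by model size gives a hard ceiling on tok/s. A quick sketch, where the quantized file sizes are assumptions for illustration, not measured values:

```python
# Back-of-envelope decode ceiling: tok/s <= bandwidth / bytes read per token.
# For single-stream decode, bytes per token is roughly the model file size.

def decode_ceiling_tok_s(model_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when every token streams the full weights."""
    return bandwidth_gb_s / model_gb

# Assumed Q4_K_M file sizes (illustrative, not measured):
llama_3b_q4_gb = 2.0
mistral_7b_q4_gb = 4.4

bw_gb_s = 68  # low end of the Orin Nano Super's quoted bandwidth range

print(f"3B ceiling: ~{decode_ceiling_tok_s(llama_3b_q4_gb, bw_gb_s):.0f} tok/s")
print(f"7B ceiling: ~{decode_ceiling_tok_s(mistral_7b_q4_gb, bw_gb_s):.0f} tok/s")
```

Real throughput lands well below the ceiling (kernel overhead, KV cache reads), which is why the 3B class is conversational here and the 7B class isn't.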

What runs well:

  • Llama 3.2 3B Q4_K_M — community runs put it in the "usable for an embedded assistant" range, conversational speed for short responses. Measure on your prompts.
  • Phi-3 Mini 3.8B Q4_K_M — Microsoft's edge-target model — community Jetson runs are in a similar conversational-speed range.
  • Whisper Small / Tiny — real-time or better on the GPU. Fine for embedded transcription.
  • Small embedders — nomic-embed, bge-small. Fine for edge RAG.

What doesn't work:

  • 7B+ models — fit at heavy quantization, but tok/s drops into chat-unusable territory.
  • Image generation (SD/Flux) — far too little memory.
  • Any workload where you need 50+ tok/s.
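The 8GB ceiling isn't just about weights: the KV cache and the OS eat into it too. A rough fit check, where every number is an assumed shape for illustration (and grouped-query attention, which shrinks the real KV cache, is deliberately ignored to keep the estimate conservative):

```python
# Rough 8 GB fit check: weights + KV cache + OS overhead must fit in
# unified memory. All shapes below are illustrative assumptions.

def fits_8gb(weights_gb: float, ctx_tokens: int, n_layers: int,
             hidden_dim: int, kv_bytes: int = 2) -> bool:
    # KV cache: 2 tensors (K and V) * layers * context * hidden * bytes/elem.
    # Ignores GQA, so this overestimates for modern models.
    kv_gb = 2 * n_layers * ctx_tokens * hidden_dim * kv_bytes / 1e9
    os_overhead_gb = 2.0  # assumed headless-Jetson reservation
    return weights_gb + kv_gb + os_overhead_gb < 8.0

# 3B-class shape at 4k context: fits with room to spare.
print(fits_8gb(2.0, 4096, 28, 3072))
# 7B-class shape at 8k context: does not fit under these assumptions.
print(fits_8gb(4.4, 8192, 32, 4096))
```

This is why "it loads" and "it's usable" diverge: a 7B can squeeze in at short context, but long-context work pushes past the ceiling.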

The real use cases:

  • Robotics integrations — drop an LLM into a robot for natural-language command parsing. 3B is plenty for that.
  • Edge devices — kiosk / appliance / industrial controller that needs LOCAL natural language. Air-gapped by design.
  • Solo home assistant — voice → STT → small LLM → TTS, all on a $249 box.
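The home-assistant bullet is just a three-stage pipe. A minimal skeleton of that loop, with placeholder stage functions (on a real Jetson you would wire in an STT, LLM, and TTS backend of your choice; the stand-ins below are purely illustrative):

```python
# Voice loop skeleton: audio -> STT -> LLM -> TTS -> audio.
# The three stages are injected as callables so backends stay swappable.

from typing import Callable

def make_assistant(stt: Callable[[bytes], str],
                   llm: Callable[[str], str],
                   tts: Callable[[str], bytes]) -> Callable[[bytes], bytes]:
    """Compose the three stages into one audio-in / audio-out handler."""
    def handle(audio_in: bytes) -> bytes:
        text = stt(audio_in)   # speech -> transcript
        reply = llm(text)      # transcript -> response text
        return tts(reply)      # response text -> audio
    return handle

# Smoke test with trivial stand-ins for the three backends:
assistant = make_assistant(
    stt=lambda audio: audio.decode(),
    llm=lambda text: f"You said: {text}",
    tts=lambda text: text.encode(),
)
print(assistant(b"turn on the lamp"))
```

Keeping the stages as plain callables means each one can be benchmarked and replaced independently, which matters on a box where every stage competes for the same 8GB.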

What it's NOT for: "I want to learn local AI without spending real money." The 8GB ceiling is too low to run anything you'd actually want to use day-to-day. A used RTX 3060 12GB at ~$200 is a better learning platform.

Where we got the numbers

Orin Nano Super specs: NVIDIA Jetson product page. Tok/s estimates from r/LocalLLaMA and r/jetson community benchmarks, May 2026.

Other questions in this thread

Other /q/ landings on the same topic — same editorial discipline.

Found this via a forum search? Bookmark the URL — we update these pages as new data lands. Have a question that should live here? Open a GitHub issue.