ONNX Runtime Mobile

Microsoft's mobile/edge variant of ONNX Runtime. It is the reference path for NPU acceleration on Windows Copilot+ PCs (Snapdragon X, Lunar Lake, Ryzen AI). Mobile builds keep binary size small by dropping operators that are not needed for inference.

By Fredoline Eruo · Last verified May 7, 2026 · 16,000 GitHub stars

Overview

Pros

First-class Windows Copilot+ PC NPU support
Microsoft-maintained — ships with Windows AI features
DirectML provider on Windows; CoreML on iOS; NNAPI on Android

Cons

Toolchain assumes ONNX intermediate format — Hugging Face → ONNX conversion is an extra step
iOS path narrower than CoreML or MLC LLM
Android NNAPI path lags MLC LLM on LLM benchmarks
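
The extra Hugging Face to ONNX step noted above can be sketched as follows. This is a minimal sketch, assuming the Hugging Face Optimum package and its optimum-cli export command are installed; the helper names build_export_command and export_to_onnx are hypothetical, and any model ID or output directory passed in is a placeholder, not something this page recommends.

```python
import shutil
import subprocess


def build_export_command(model_id, output_dir, task=None):
    """Build the argument list for an Optimum ONNX export (hypothetical helper)."""
    cmd = ["optimum-cli", "export", "onnx", "--model", model_id]
    if task:
        # Optional: name the task explicitly instead of letting Optimum infer it.
        cmd += ["--task", task]
    cmd.append(output_dir)
    return cmd


def export_to_onnx(model_id, output_dir, task=None):
    """Run the export; assumes optimum-cli is on PATH (an assumption, not a given)."""
    if shutil.which("optimum-cli") is None:
        raise RuntimeError("optimum-cli not found; install Hugging Face Optimum first")
    subprocess.run(build_export_command(model_id, output_dir, task), check=True)
```

Keeping command construction separate from execution makes the step easy to inspect before anything is downloaded or converted.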

Compatibility

Operating systems	Android, iOS, Windows
Acceleration backends	NPU, DirectML, CoreML
License	Free and open-source

Runtime health

Operator-grade signals on how actively ONNX Runtime Mobile is being maintained, how fresh its measurements are, and what failure classes operators have flagged. Every label below is anchored to a real date or count — we never infer maintainer activity we can't show.

Release cadence

Derived from the most recent editorial signal on this entry.

Active

Updated May 7, 2026

6 days since last refresh · source: lastUpdated

Benchmark freshness

How recent the editorial measurements on this runtime are.

0 editorial benchmarks

No editorial benchmarks for this runtime yet.

Community reproduction

Submissions that match an editorial measurement on similar hardware.

0 reproduced reports

No community reproductions on file yet.

Get ONNX Runtime Mobile

Official site

https://onnxruntime.ai/docs/tutorials/mobile/

GitHub

https://github.com/microsoft/onnxruntime

Frequently asked

Is ONNX Runtime Mobile free?

Yes. ONNX Runtime Mobile is free and open-source; there is no paid tier.

What operating systems does ONNX Runtime Mobile support?

ONNX Runtime Mobile supports Android, iOS, and Windows.

Which GPUs work with ONNX Runtime Mobile?

ONNX Runtime Mobile targets acceleration backends rather than specific GPU models: NPU, DirectML, and CoreML. CPU-only inference also works but is slow.
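
To illustrate the per-platform pairing from the Pros list (DirectML on Windows, CoreML on iOS, NNAPI on Android) with CPU as the fallback, here is a minimal sketch using the ONNX Runtime Python API. The provider strings are ONNX Runtime's registered execution provider names; the PREFERRED mapping and the pick_providers and open_session helpers are hypothetical. Python is used only for brevity, since on-device apps would normally use the Java/Kotlin, Objective-C/Swift, or C/C++ bindings.

```python
# Preferred execution provider per OS, taken from the pairing on this page.
# The mapping itself is an illustrative assumption, not an official table.
PREFERRED = {
    "windows": ["DmlExecutionProvider"],
    "ios": ["CoreMLExecutionProvider"],
    "android": ["NnapiExecutionProvider"],
}
CPU = "CPUExecutionProvider"


def pick_providers(os_name, available):
    """Return the preferred providers that are available, with CPU last."""
    wanted = PREFERRED.get(os_name.lower(), [])
    chosen = [p for p in wanted if p in available]
    chosen.append(CPU)  # CPU is always kept as the final fallback
    return chosen


def open_session(model_path, os_name):
    """Create an inference session; requires the onnxruntime package."""
    import onnxruntime as ort  # imported lazily so pick_providers works without it

    providers = pick_providers(os_name, ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

ONNX Runtime tries providers in the order given, so listing the accelerator first and CPU last means operators the accelerator cannot handle still run, just more slowly.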

Reviewed by RunLocalAI Editorial. See our editorial policy for how we evaluate tools.

Alternatives

MLX-LM ExLlamaV2 llama.cpp Llamafile Ollama IPEX-LLM CTranslate2 Intel OpenVINO

Before you buy

Verify ONNX Runtime Mobile runs on your specific hardware before committing money.
