minicpm

4B parameters

Commercial OK

Reviewed May 2026

MiniCPM 3 4B

OpenBMB's edge-optimized 4B. MIT license; designed for phone deployment. Strong reasoning per parameter.

License: MIT·Released Sep 12, 2024·Context: 32,768 tokens

Overview

OpenBMB's edge-optimized 4B. MIT license; designed for phone deployment. Strong reasoning per parameter.

Strengths

MIT license
Phone-deployable
Strong reasoning per param

Weaknesses

4B ceiling limits open-ended generation depth

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization	File size	VRAM required
Q4_K_M	2.4 GB	4 GB

Get the model

HuggingFace

Original weights

huggingface.co/openbmb/MiniCPM3-4B

Source repository — direct quantization required.

Hardware that runs this

Cards with enough VRAM for at least one quantization of MiniCPM 3 4B.

Frequently asked

What's the minimum VRAM to run MiniCPM 3 4B?

4GB of VRAM is enough to run MiniCPM 3 4B at the Q4_K_M quantization (file size 2.4 GB). Higher-quality quantizations need more.

Can I use MiniCPM 3 4B commercially?

Yes — MiniCPM 3 4B ships under the MIT, which permits commercial use. Always read the license text before deployment.

What's the context length of MiniCPM 3 4B?

MiniCPM 3 4B supports a context window of 32,768 tokens (about 33K).

Source: huggingface.co/openbmb/MiniCPM3-4B

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.

Related — keep moving

Compare hardware

Buyer guides

When it doesn't work

Recommended hardware

Before you buy

Verify MiniCPM 3 4B runs on your specific hardware before committing money.

Will it run on my hardware? →Custom hardware comparison →GPU recommender (4 questions) →