Can I use VibeThinker-3B commercially?

Yes. VibeThinker-3B ships under the MIT license, which permits commercial use. Always read the full license text before deployment.

What's the context length of VibeThinker-3B?

VibeThinker-3B supports a context window of 131,072 tokens (128K). The authors describe a ~64K effective training window, so very long inputs may be less reliable.

VibeThinker-3B — local inference guide

VibeThinker-3B is a compact open-weight reasoning model from WeiboAI (Sina Weibo), fine-tuned from Qwen2.5-Coder-3B and published on Hugging Face as `WeiboAI/VibeThinker-3B` in June 2026. It is a dense ~3B-parameter model that runs in roughly 6.7 GB of VRAM and is specialized for verifiable math, coding, and STEM reasoning; it is not a general-purpose assistant. `config.json` exposes a 131,072-token context, though the authors describe a ~64K effective training window. It is MIT-licensed, so commercial use is permitted. Author-reported benchmarks (arXiv 2606.16140) cite strong AIME, LiveCodeBench, and IMO-AnswerBench scores; these are vendor-reported and not independently verified. Its small size makes it the most consumer- and edge-friendly reasoning model of the June 2026 batch.

License: MIT · Released: Jun 12, 2026 · Context: 131,072 tokens

Positioning

VibeThinker-3B is the standout small model of June 2026 — a dense ~3B reasoning specialist from WeiboAI, fine-tuned from Qwen2.5-Coder-3B, under a clean MIT license. Its entire pitch is verifiable reasoning (math, coding, STEM) on hardware anyone owns.

What stands out

It runs in ~6.7 GB of VRAM, which fits a single 8 GB consumer GPU or even a CPU, yet the authors report frontier-class math and coding scores for its size (AIME, LiveCodeBench, IMO-AnswerBench). For offline, air-gapped, or edge reasoning, a 3B MIT-licensed model that does real chain-of-thought is genuinely useful where a 70B model will not fit. This is the most consumer- and edge-friendly release of the month; check whether your GPU clears it on will-it-run.
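
As a rough sanity check on that memory figure, here is a back-of-the-envelope sketch of weight-only memory at a few precisions. It assumes exactly 3 billion parameters (the text only says "~3B") and ignores KV cache, activations, and runtime overhead, so real usage will be higher. The source does not say which precision the ~6.7 GB figure corresponds to, and the function name is illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Weight-only memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: "~3B" is treated as exactly 3 billion parameters.
PARAMS = 3e9

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(PARAMS, bits)
    print(f"{label}: ~{gb:.1f} GB for weights alone")
```

Under these assumptions, 16-bit weights alone take about 6 GB, which leaves little headroom on an 8 GB card once cache and overhead are added. That is one reason a smaller quant may be the more comfortable choice on consumer hardware.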

Honest caveats

It is a specialist, not a generalist: it has no tool or agent training and is weaker on broad knowledge (GPQA-Diamond ~70). All scores are author-reported (arXiv 2606.16140) and have not been verified by third parties. The config exposes a 131,072-token (128K) context, but the authors describe a ~64K effective training window, so treat very-long-context use cautiously.
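
One cautious way to act on the context caveat is to budget generation against the smaller window by default. The sketch below is illustrative only: it reads "~64K" as 65,536 tokens (an assumption, since the authors give only an approximate figure), and the helper name `max_new_tokens` is hypothetical rather than part of any library.

```python
CONFIG_CONTEXT = 131_072     # context length exposed by config.json, per the text
EFFECTIVE_CONTEXT = 65_536   # assumption: "~64K" read as 64 * 1024 tokens


def max_new_tokens(prompt_tokens: int, limit: int = EFFECTIVE_CONTEXT) -> int:
    """Tokens left for generation once the prompt is counted against `limit`."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(0, limit - prompt_tokens)


# A 60,000-token prompt leaves little room under the conservative limit.
print(max_new_tokens(60_000))
# The same prompt measured against the full config limit.
print(max_new_tokens(60_000, CONFIG_CONTEXT))
```

Defaulting to the conservative limit keeps long reasoning traces inside the window the authors say the model was effectively trained on. Callers who want the full config limit can still pass it explicitly.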

Verdict

Run it if you want local, offline, verifiable math/coding reasoning on a single consumer GPU and you keep it to its lane. Do not expect a general assistant or agentic tool use. For its size and license it is the easiest "reasoning model on my own machine" entry point in the catalog — pair it with Ollama and a small quant.
