Qwen3 Swallow 32B RL v0.2
A 32B Japanese-English model built on Qwen3, trained with continual pre-training, supervised fine-tuning, and reinforcement learning with verifiable rewards. The RL stage targets math, coding, and general reasoning. This is v0.2, so the training pipeline has had at least one revision.
If you need a Japanese model licensed for commercial use that can handle math and code, not just chat, this is worth a look at the 32B tier. The RLVR training is a meaningful differentiator over plain SFT Swallow variants. That said, near-zero community adoption means you are largely on your own if something breaks. Skip it if you need function calling or a quantized option with low quality loss.
Why this rating
Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.00/10. License (apache-2.0) is explicit in the card and commercial use is correctly flagged. Vendor, family, and 32B param count are verified. Context length of 32768 is reasonable for Qwen3 base but not explicitly stated in the excerpt — minor verification gap. Description is honest and operator-voiced, correctly noting RLVR focus on math/code, the GPTQ deprecation issue (directly from card), and weak community adoption. Use case is sharp (Japanese-English math/code reasoning). Weaknesses are concrete and useful for a local-AI operator deciding whether to deploy.
Flags:
- contextLength 32768 not explicitly confirmed in visible card excerpt; inherited from Qwen3 base assumption
- Tool/function calling claim ('not explicitly supported') is an inference, not directly stated in excerpt
Strengths
- Bilingual Japanese-English coverage on a capable 32B base
- RLVR training specifically targets math and coding — not just chat quality
- Apache-2.0 license, commercial use permitted
- 32K context window (32,768 tokens, inherited from the Qwen3 base; not explicitly confirmed in the card excerpt)
Weaknesses
- Function calling / tool use not explicitly supported
- No reasoning toggle — you cannot switch chain-of-thought off
- GPTQ quantization deprecated by the vendor; quantized variants may underperform
- Very low community traction so far (3K downloads, 1 like) — limited real-world feedback
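Since chain-of-thought cannot be switched off, one workaround is to separate the reasoning from the final answer after generation. The sketch below assumes the model wraps its reasoning in Qwen3-style `<think>...</think>` tags; the card excerpt does not confirm this, so check real output first. The function name `split_reasoning` is hypothetical.

```python
import re

# ASSUMPTION: reasoning is delimited by Qwen3-style <think>...</think> tags.
# Verify against actual model output before relying on this.
_THINK_CAPTURE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_THINK_STRIP = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw generated text.

    If no <think> block is present, reasoning is an empty string and the
    answer is the stripped input.
    """
    reasoning = "\n".join(m.strip() for m in _THINK_CAPTURE.findall(text))
    answer = _THINK_STRIP.sub("", text).strip()
    return reasoning, answer
```

This only hides the reasoning from downstream consumers. The tokens are still generated, so latency and cost are unchanged.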
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 17.6 GB | 23 GB |
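To make the table row easier to interpret, here is a rough back-of-the-envelope sketch. It assumes a nominal 32e9 parameters and 1 GB = 1e9 bytes, neither of which the source states exactly, and the helper names are illustrative only.

```python
# Figures taken from the Q4_K_M row of the table above.
PARAMS = 32e9            # ASSUMPTION: nominal "32B"; the exact count may differ
FILE_SIZE_GB = 17.6      # file size from the table
VRAM_REQUIRED_GB = 23.0  # VRAM requirement from the table


def bits_per_weight(file_size_gb: float, params: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    return file_size_gb * 1e9 * 8 / params


def runtime_overhead_gb(vram_gb: float, file_size_gb: float) -> float:
    """VRAM beyond the weights file (KV cache, activations, buffers)."""
    return vram_gb - file_size_gb


def fits(card_vram_gb: float, required_gb: float = VRAM_REQUIRED_GB) -> bool:
    """True if a card's VRAM meets the listed requirement."""
    return card_vram_gb >= required_gb


if __name__ == "__main__":
    print(f"{bits_per_weight(FILE_SIZE_GB, PARAMS):.2f} bits per weight")
    print(f"{runtime_overhead_gb(VRAM_REQUIRED_GB, FILE_SIZE_GB):.1f} GB overhead")
    print(fits(24.0), fits(16.0))
```

Under these assumptions the row works out to roughly 4.4 bits per weight, with about 5.4 GB of the listed VRAM going to something other than the weights. A 24 GB card clears the 23 GB figure with little headroom, and a 16 GB card does not.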
Get the model
Hugging Face hosts the original weights in the source repository. You have to quantize them yourself.
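Fetching the original weights could look like the sketch below. It assumes the third-party `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper names and the local directory are hypothetical, and the repository id is derived from the source URL listed on this page.

```python
from urllib.parse import urlparse

# Source URL as listed on this page (no scheme in the original).
SOURCE_URL = "huggingface.co/tokyotech-llm/Qwen3-Swallow-32B-RL-v0.2"


def repo_id_from_url(url: str) -> str:
    """Extract 'owner/name' from a Hugging Face model URL."""
    if "://" not in url:
        url = "https://" + url  # urlparse needs a scheme to split host from path
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url!r}")
    return f"{parts[0]}/{parts[1]}"


def download_weights(repo_id: str, local_dir: str) -> str:
    """Download the full repository snapshot and return its local path.

    Imported lazily so the URL helper works without huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download  # third-party dependency

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

A call such as `download_weights(repo_id_from_url(SOURCE_URL), "./qwen3-swallow-32b")` would pull the unquantized weights, which are far larger than the 17.6 GB Q4_K_M figure. Quantization is a separate step that this sketch does not cover.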
Frequently asked
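What's the minimum VRAM to run Qwen3 Swallow 32B RL v0.2?
About 23 GB for the Q4_K_M quantization (17.6 GB file), the only variant listed here.
Can I use Qwen3 Swallow 32B RL v0.2 commercially?
Yes. The model is released under Apache-2.0, which permits commercial use.
What's the context length of Qwen3 Swallow 32B RL v0.2?
32,768 tokens (32K), inherited from the Qwen3 base. The figure is not explicitly confirmed in the visible card excerpt.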
Source: huggingface.co/tokyotech-llm/Qwen3-Swallow-32B-RL-v0.2
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.