GPT-OSS Swallow 20B RL v0.1
At 20B parameters, with a 32K context window (inherited from the GPT-OSS base rather than stated in the card) and an Apache-2.0 license, this is a reasonable option if you need a commercially deployable, Japanese-capable model with reasoning-focused training. The RLVR stage sets it apart from SFT-only models, but without published benchmarks it is hard to say how much it helps. Skip it if you need reliable tool or function calling, which is untested. A hedge pick for teams willing to run their own evals before committing.
Why this rating
Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.10/10. License is explicit apache-2.0 in the card, commercial-OK flag is correct. Metadata (20B, vendor, family base of gpt-oss, bilingual JA/EN) matches the card. Context length of 32768 is a reasonable inherited claim from gpt-oss base but isn't directly stated in the excerpt — minor deduction. Editorial voice is operator-grade: honest about untested tool use, low traction, and absence of benchmarks in the draft. Use case is sharply scoped to Japanese-English bilingual reasoning + STEM. Deployability would be stronger with explicit VRAM guidance and a GGUF note, but weaknesses are candid enough.
Flags:
- contextLength 32768 is inherited from the gpt-oss base rather than directly verified in this card excerpt
- No VRAM/quantization guidance for a 20B MoE-derived model, so readers may underestimate footprint
Overview
A 20B bilingual model from TokyoTech built on GPT-OSS via continual pre-training, SFT, and reinforcement learning with verifiable rewards (RLVR). Targets Japanese proficiency without sacrificing math and coding capability. Apache-2.0 licensed and commercially usable.
Strengths
- Bilingual Japanese-English with deliberate Japanese enhancement during training
- RLVR pipeline aimed at improving reasoning quality
- Claims to retain the base model's STEM (math and coding) performance
- Apache-2.0 — no commercial restrictions
Weaknesses
- Tool use and function calling are untested — don't assume compatibility
- Model identity responses may be inconsistent or unreliable
- Low community traction so far (2,683 downloads, 20 likes on HF)
- No published benchmark numbers in the draft — hard to verify STEM claims independently
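Since no benchmark numbers are published, one way to check the STEM and bilingual claims yourself is a small exact-match harness. The following is a minimal sketch, not an established benchmark: `generate` is a hypothetical callable wrapping whatever inference stack you use, and the two prompts are illustrative placeholders rather than items from any real test set.

```python
from typing import Callable, Iterable


def exact_match_accuracy(
    generate: Callable[[str], str],
    cases: Iterable[tuple[str, str]],
) -> float:
    """Fraction of prompts whose stripped output equals the expected answer."""
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(
        1 for prompt, expected in cases
        if generate(prompt).strip() == expected.strip()
    )
    return hits / len(cases)


if __name__ == "__main__":
    # Illustrative prompts only; replace with your own Japanese and English items.
    cases = [
        ("12 × 7 は？数字だけで答えてください。", "84"),
        ("What is 15 + 27? Answer with the number only.", "42"),
    ]

    # Stand-in for a real model call (hypothetical); swap in your inference client.
    canned = dict(cases)

    def stub_generate(prompt: str) -> str:
        return canned.get(prompt, "")

    print(f"exact match: {exact_match_accuracy(stub_generate, cases):.2f}")
```

Exact match only suits answers that can be verified mechanically, such as numbers or short strings. Free-form Japanese output needs a different scoring approach.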
Quantization variants
Quantization trades some output quality for a smaller file and lower VRAM use. Q4_K_M, the only variant listed here, is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 11.0 GB | 14 GB |
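To turn the table into a quick fit check, the sketch below compares a card's VRAM against the listed requirement. The figures are copied from the table. The `headroom_gb` parameter is an assumption added for illustration (for example, to reserve memory for a desktop session) and is not part of the published numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quant:
    name: str
    file_gb: float
    vram_gb: float


# Values copied from the table; only Q4_K_M is listed.
QUANTS = [Quant("Q4_K_M", 11.0, 14.0)]


def fitting_quants(gpu_vram_gb: float, headroom_gb: float = 0.0) -> list[Quant]:
    """Return the listed quantizations whose VRAM requirement fits the card."""
    usable = gpu_vram_gb - headroom_gb
    return [q for q in QUANTS if q.vram_gb <= usable]


if __name__ == "__main__":
    for vram in (12, 16, 24):
        names = [q.name for q in fitting_quants(vram)]
        print(f"{vram} GB card -> {names or 'no listed quantization fits'}")
```

Long prompts enlarge the KV cache, so treat the listed requirement as a floor rather than a guarantee at the full context length.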
Get the model
HuggingFace
Original weights
The source repository hosts the original weights only, so you need to quantize them yourself.
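Before quantizing, you need a local copy of the original weights. A minimal sketch, assuming the `huggingface_hub` package is installed and using the repository ID from the source link on this page; the `local_dir` default is an arbitrary example path, and the download is large, so check disk space first.

```python
REPO_ID = "tokyotech-llm/GPT-OSS-Swallow-20B-RL-v0.1"


def repo_url() -> str:
    """Web URL of the source repository."""
    return f"https://huggingface.co/{REPO_ID}"


def download_weights(local_dir: str = "./swallow-20b-rl") -> str:
    """Download the full repository snapshot and return its local path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    print(f"Fetching {repo_url()}")
    print(f"Weights saved to {download_weights()}")
```

Quantization is a separate step done with your tool of choice and is not shown here.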
Hardware that runs this
Cards with enough VRAM for at least one quantization of GPT-OSS Swallow 20B RL v0.1.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
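What's the minimum VRAM to run GPT-OSS Swallow 20B RL v0.1?
About 14 GB for the Q4_K_M quantization (11.0 GB file), the only variant listed on this page.
Can I use GPT-OSS Swallow 20B RL v0.1 commercially?
Yes. The model card states an Apache-2.0 license, which permits commercial use.
What's the context length of GPT-OSS Swallow 20B RL v0.1?
32,768 tokens, a figure inherited from the GPT-OSS base rather than stated directly in the card excerpt.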
Source: huggingface.co/tokyotech-llm/GPT-OSS-Swallow-20B-RL-v0.1
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.