Japanese StableLM Instruct Gamma 7B
A 7B instruction-tuned model from Stability AI built specifically for Japanese, using the Mistral architecture. Quantized to GGUF by TheBloke, so it runs on consumer hardware without extra steps. Supports up to 32K context tokens.
If you need a locally-runnable Japanese instruction model with a long context window, this is a technically sound option at 7B. The Apache-2.0 license removes commercial friction. That said, the very low HF engagement means you are largely on your own if something breaks — there is little community debugging to lean on. Hedge: worth testing against your Japanese workload, but validate quality before committing it to production.
›Why this rating
Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.25/10. License is explicit apache-2.0 in the card, commercial OK is correct. Metadata aligns: Mistral-based 7B, Stability AI vendor, GGUF quantized by TheBloke. Context length of 32768 is consistent with the underlying Mistral 7B base. Description and verdict are honest, operator-voiced, and flag the low engagement and absence of benchmarks. One minor blemish: useCases array contains 'german' which appears to be a typo/error — this is a Japanese-only model and the stray 'german' tag is misleading and should be removed before publish.
Flags: - useCases array contains 'german' — incorrect for a Japanese-only model; remove before publish - contextLength of 32768 inherited from Mistral base is plausible but not explicitly confirmed in the excerpt shown
Overview
A 7B instruction-tuned model from Stability AI built specifically for Japanese, using the Mistral architecture. Quantized to GGUF by TheBloke, so it runs on consumer hardware without extra steps. Supports up to 32K context tokens.
Strengths
- 32,768-token context window handles long documents or multi-turn conversations
- Instruction-tuned specifically for Japanese language tasks
- GGUF quantization means straightforward CPU and GPU deployment
- Apache-2.0 license — commercial use permitted
Weaknesses
- 7B scale; expect limitations on complex multi-step reasoning
- Japanese-focused — do not rely on it for other languages
- Low community traction: 7,596 downloads and 10 likes on HF suggest limited real-world validation
- No benchmark numbers provided to verify Japanese-language quality claims
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 3.9 GB | 5 GB |
Get the model
HuggingFace
Original weights
Source repository — direct quantization required.
Hardware that runs this
Cards with enough VRAM for at least one quantization of Japanese StableLM Instruct Gamma 7B.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run Japanese StableLM Instruct Gamma 7B?
Can I use Japanese StableLM Instruct Gamma 7B commercially?
What's the context length of Japanese StableLM Instruct Gamma 7B?
Source: huggingface.co/TheBloke/japanese-stablelm-instruct-gamma-7B-GGUF
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
Related — keep moving
Verify Japanese StableLM Instruct Gamma 7B runs on your specific hardware before committing money.