LLM-jp 4 8B Instruct
An 8B bilingual model from Japan's National Institute of Informatics, instruction-tuned via SFT on a Japanese/English corpus of 11.7T tokens. Supports up to 65k context. This is a research release, not a production-hardened model.
If you need a permissively licensed Japanese bilingual model with a large context window, LLM-jp-4 8B is a reasonable research pick. The SFT-only alignment is the real concern — don't deploy this in any user-facing product without adding your own safety layer. For internal tooling or experimentation it earns a cautious try, but production teams should wait for a better-aligned follow-up release or evaluate larger alternatives. Hedge.
›Why this rating
Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.10/10. License (apache-2.0) is explicit in the card and the commercial flag is correctly set. Metadata (8B, 65,536 context, llama arch, llm-jp vendor) matches the card precisely. The description is honest and operator-voiced, correctly flagging that this instruct variant is SFT-only (unlike the thinking variant which uses DPO). Weaknesses honestly call out the alignment gap and low traction. Minor nit: the description claims '11.7T tokens' SFT corpus, which is almost certainly the pretraining corpus, not SFT — but this isn't shown in the excerpt so can't be fully verified; also no GGUF/quant guidance for deployability. Still clears the bar.
Flags: - Description says 'instruction-tuned via SFT on a Japanese/English corpus of 11.7T tokens' — 11.7T is almost certainly the pretraining token count, not SFT data; phrasing should be clarified - No mention of VRAM expectations or GGUF availability for local deployment
Overview
An 8B bilingual model from Japan's National Institute of Informatics, instruction-tuned via SFT on a Japanese/English corpus of 11.7T tokens. Supports up to 65k context. This is a research release, not a production-hardened model.
Strengths
- Genuine Japanese/English bilingual training — not a bolted-on adapter
- 65,536-token context window handles long documents
- Pretrained on a large 11.7T token corpus
- Apache-2.0 license, commercial use allowed
Weaknesses
- SFT-only alignment — no DPO or RLHF, so outputs can drift or be unsafe
- Safety tuning is explicitly incomplete at this research stage
- 8B parameter count limits performance on multi-step reasoning tasks
- Low community traction so far (13k downloads, 7 likes) — limited real-world validation
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 4.4 GB | 6 GB |
Get the model
HuggingFace
Original weights
Source repository — direct quantization required.
Hardware that runs this
Cards with enough VRAM for at least one quantization of LLM-jp 4 8B Instruct.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run LLM-jp 4 8B Instruct?
Can I use LLM-jp 4 8B Instruct commercially?
What's the context length of LLM-jp 4 8B Instruct?
Source: huggingface.co/llm-jp/llm-jp-4-8b-instruct
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
Related — keep moving
Verify LLM-jp 4 8B Instruct runs on your specific hardware before committing money.