llama
8B parameters
Commercial OK
Reviewed May 2026

LLM-jp 4 8B Instruct

An 8B bilingual model from Japan's National Institute of Informatics, instruction-tuned via SFT on a Japanese/English corpus of 11.7T tokens. Supports up to 65k context. This is a research release, not a production-hardened model.

License: apache-2.0·Context: 65,536 tokens
BLK · VERDICT

Our verdict

OP · Fredoline Eruo|VERIFIED MAY 28, 2026
9.1/10

If you need a permissively licensed Japanese bilingual model with a large context window, LLM-jp-4 8B is a reasonable research pick. The SFT-only alignment is the real concern — don't deploy this in any user-facing product without adding your own safety layer. For internal tooling or experimentation it earns a cautious try, but production teams should wait for a better-aligned follow-up release or evaluate larger alternatives. Hedge.

Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.10/10. License (apache-2.0) is explicit in the card and the commercial flag is correctly set. Metadata (8B, 65,536 context, llama arch, llm-jp vendor) matches the card precisely. The description is honest and operator-voiced, correctly flagging that this instruct variant is SFT-only (unlike the thinking variant which uses DPO). Weaknesses honestly call out the alignment gap and low traction. Minor nit: the description claims '11.7T tokens' SFT corpus, which is almost certainly the pretraining corpus, not SFT — but this isn't shown in the excerpt so can't be fully verified; also no GGUF/quant guidance for deployability. Still clears the bar.

Flags: - Description says 'instruction-tuned via SFT on a Japanese/English corpus of 11.7T tokens' — 11.7T is almost certainly the pretraining token count, not SFT data; phrasing should be clarified - No mention of VRAM expectations or GGUF availability for local deployment

Overview

An 8B bilingual model from Japan's National Institute of Informatics, instruction-tuned via SFT on a Japanese/English corpus of 11.7T tokens. Supports up to 65k context. This is a research release, not a production-hardened model.

Strengths

  • Genuine Japanese/English bilingual training — not a bolted-on adapter
  • 65,536-token context window handles long documents
  • Pretrained on a large 11.7T token corpus
  • Apache-2.0 license, commercial use allowed

Weaknesses

  • SFT-only alignment — no DPO or RLHF, so outputs can drift or be unsafe
  • Safety tuning is explicitly incomplete at this research stage
  • 8B parameter count limits performance on multi-step reasoning tasks
  • Low community traction so far (13k downloads, 7 likes) — limited real-world validation

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

QuantizationFile sizeVRAM required
Q4_K_M4.4 GB6 GB

Get the model

HuggingFace

Original weights

huggingface.co/llm-jp/llm-jp-4-8b-instruct

Source repository — direct quantization required.

Hardware that runs this

Cards with enough VRAM for at least one quantization of LLM-jp 4 8B Instruct.

Compare alternatives

Models worth comparing

Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.

Frequently asked

What's the minimum VRAM to run LLM-jp 4 8B Instruct?

6GB of VRAM is enough to run LLM-jp 4 8B Instruct at the Q4_K_M quantization (file size 4.4 GB). Higher-quality quantizations need more.

Can I use LLM-jp 4 8B Instruct commercially?

Yes — LLM-jp 4 8B Instruct ships under the apache-2.0, which permits commercial use. Always read the license text before deployment.

What's the context length of LLM-jp 4 8B Instruct?

LLM-jp 4 8B Instruct supports a context window of 65,536 tokens (about 66K).

Source: huggingface.co/llm-jp/llm-jp-4-8b-instruct

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.

Related — keep moving

Before you buy

Verify LLM-jp 4 8B Instruct runs on your specific hardware before committing money.