internlm
8B parameters
Restricted
Reviewed June 2026

InternLM 3 8B

Shanghai AI Lab's open-research line. InternLM 3 at 8B; strong on Chinese-language tasks.

License: InternLM License · Released Oct 5, 2025 · Context: 32,768 tokens

Our verdict

OP · Fredoline Eruo | Verified Jun 12, 2026
unrated

Positioning

InternLM 3 8B is a dense 8-billion-parameter model released by Shanghai AI Lab under the InternLM License. With a 32,768-token context window and a size that fits consumer GPUs, it targets Chinese-language workloads on local hardware. It belongs to the lab's open-research InternLM line. The license allows commercial use only under its stated conditions, which is why this catalog marks it Restricted.

Strengths

  • Chinese-language focus: Built and optimized for Chinese-language workloads, making it a strong choice for applications requiring native Chinese understanding.
  • Commercial use possible: The InternLM License allows commercial deployment subject to its terms, so proprietary use is an option once those terms have been reviewed.
  • Consumer-friendly size: At 8B parameters, the model fits within consumer GPU memory constraints, especially with quantization.
  • Long context: A 32K context window supports extended conversations or document processing without truncation.

Limitations

  • No community benchmarks available: We do not have independent measurements for this model. Operators should treat published vendor metrics as best-case until verified.
  • Niche language strength: While strong on Chinese, its performance on English or other languages is unverified and may be weaker than general-purpose models.
  • Dense architecture: Unlike Mixture-of-Experts models, all 8B parameters are active on every forward pass, so each token pays the compute cost of the full model rather than a smaller active subset.
  • License restrictions: The InternLM License may have specific terms that differ from Apache 2.0 or MIT; operators should review the full license text before deployment.

What it takes to run this locally

At FP16, the weights take ~16 GB on disk (8B parameters at 2 bytes each). Quantized versions reduce this significantly: Q8_0 ~9 GB, Q6_K ~6.6 GB, Q5_K_M ~5.7 GB, Q4_K_M ~4.5 GB, Q3_K_M ~3.9 GB, Q2_K ~2.6 GB. Add ~30-50% on top of the file size for KV cache and framework overhead at typical context lengths. This places the model in the consumer deployment class: Q4_K_M or lower fits on a single GPU with roughly 6-8 GB of VRAM, and 12-24 GB cards leave headroom for higher quantizations or longer contexts.
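The arithmetic above can be sketched as a small calculation. This is a rough estimate only: it applies the ~30-50% overhead rule of thumb from this section to the approximate file sizes listed, and the helper names `estimate_vram_gb` and `largest_quant_that_fits` are illustrative, not part of any library.

```python
# Approximate weight-file sizes in GB, copied from the paragraph above.
QUANT_FILE_SIZES_GB = {
    "FP16": 16.0,
    "Q8_0": 9.0,
    "Q6_K": 6.6,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.5,
    "Q3_K_M": 3.9,
    "Q2_K": 2.6,
}


def estimate_vram_gb(file_size_gb, overhead_low=0.30, overhead_high=0.50):
    """Return a (low, high) VRAM estimate in GB: file size plus 30-50% overhead."""
    return file_size_gb * (1 + overhead_low), file_size_gb * (1 + overhead_high)


def largest_quant_that_fits(vram_gb):
    """Pick the largest quantization whose pessimistic (high) estimate fits."""
    by_size_desc = sorted(
        QUANT_FILE_SIZES_GB.items(), key=lambda item: item[1], reverse=True
    )
    for name, size_gb in by_size_desc:
        _, high = estimate_vram_gb(size_gb)
        if high <= vram_gb:
            return name
    return None  # nothing fits, not even Q2_K


for name, size_gb in QUANT_FILE_SIZES_GB.items():
    low, high = estimate_vram_gb(size_gb)
    print(f"{name:7s} file ~{size_gb:4.1f} GB -> VRAM ~{low:4.1f}-{high:4.1f} GB")
```

With these inputs, an 8 GB card lands on Q4_K_M under the pessimistic 50% overhead. Real usage depends on context length and the runtime, so treat the output as a starting point rather than a guarantee.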

Should you run this locally?

Yes if you need a model optimized for Chinese-language tasks, can work within the InternLM License terms, and have a consumer GPU with at least 8 GB VRAM (for Q4_K_M or lower). The 32K context window is beneficial for Chinese document processing or long-form dialogue.

No if your primary language is English or you require verified benchmark performance. Without community benchmarks, the model's capabilities are uncertain. Consider a more widely tested model if reproducibility is critical.

Catalog cross-links

  • InternLM 2 7B
  • Qwen 2.5 7B
  • Consumer GPU guide

Overview

Strengths

  • Chinese-language strength
  • Active research lineage

Weaknesses

  • Commercial use restricted

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization   File size   VRAM required
Q4_K_M         4.7 GB      6 GB

Get the model

HuggingFace

Original weights

huggingface.co/internlm/internlm3-8b-instruct

Source repository with the original weights only. To get a quantized file, you need to convert and quantize the weights yourself.
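One common route for quantizing original weights is llama.cpp's `convert_hf_to_gguf.py` script followed by its `llama-quantize` tool. The sketch below only builds the two command lines so they can be inspected before running. It assumes that the weights are already downloaded to a local folder, that a llama.cpp checkout is built under `llama.cpp/build/bin`, and that your llama.cpp version supports this model's architecture; none of this is confirmed by the source page. The function name, folder names, and output file names are hypothetical.

```python
import shlex
from pathlib import Path


def build_quantize_commands(model_dir, out_dir, quant_type="Q4_K_M",
                            llama_cpp_dir="llama.cpp"):
    """Return (convert_cmd, quantize_cmd) as argument lists; nothing is executed."""
    out_dir = Path(out_dir)
    llama_cpp_dir = Path(llama_cpp_dir)
    # Hypothetical output names, chosen for illustration.
    f16_gguf = out_dir / "internlm3-8b-instruct-f16.gguf"
    quant_gguf = out_dir / f"internlm3-8b-instruct-{quant_type}.gguf"

    # Step 1: convert the Hugging Face checkpoint to an FP16 GGUF file.
    convert_cmd = [
        "python", str(llama_cpp_dir / "convert_hf_to_gguf.py"),
        str(Path(model_dir)),
        "--outfile", str(f16_gguf),
        "--outtype", "f16",
    ]
    # Step 2: quantize the FP16 GGUF down to the requested type.
    # The binary location is an assumption; it depends on how llama.cpp was built.
    quantize_cmd = [
        str(llama_cpp_dir / "build" / "bin" / "llama-quantize"),
        str(f16_gguf),
        str(quant_gguf),
        quant_type,
    ]
    return convert_cmd, quantize_cmd


convert_cmd, quantize_cmd = build_quantize_commands("internlm3-8b-instruct", "gguf")
print(shlex.join(convert_cmd))
print(shlex.join(quantize_cmd))
```

Printing the commands first makes it easy to adjust paths before handing them to `subprocess.run`. Expect the intermediate FP16 file to need roughly as much disk space as the original weights.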

Frequently asked

What's the minimum VRAM to run InternLM 3 8B?

6 GB of VRAM is enough to run InternLM 3 8B at the Q4_K_M quantization (file size 4.7 GB). Higher-quality quantizations need more.

Can I use InternLM 3 8B commercially?

InternLM 3 8B is released under the InternLM License, which has restrictions for commercial use. Review the license terms before using it in a product.

What's the context length of InternLM 3 8B?

InternLM 3 8B supports a context window of 32,768 tokens (32K).
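Filling the whole window costs memory beyond the weights, because the KV cache grows linearly with the number of tokens. A minimal sketch of the standard sizing formula follows. The layer count, KV-head count, and head dimension in the example call are placeholder values, not confirmed figures for this model; read the real ones from the model's `config.json` before relying on the result.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_value=2):
    """Size of the KV cache in bytes for one sequence.

    The leading factor of 2 covers one key tensor and one value tensor per layer.
    bytes_per_value=2 corresponds to an FP16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_value


# Placeholder architecture values (assumptions, not taken from the model card).
size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, n_tokens=32_768)
print(f"KV cache at 32,768 tokens: {size / 1024**3:.1f} GiB")
```

Under these placeholder values, a full 32,768-token FP16 cache comes to 4 GiB, so long-context use can need noticeably more VRAM than the minimum quoted for short prompts.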

Source: huggingface.co/internlm/internlm3-8b-instruct

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
