llama

40B parameters

Commercial OK

Reviewed May 2026

ALIA 40b instruct 2601

BSC-LT's 40B instruction-tuned model with first-class support for Spanish, Catalan, Basque, and Galician alongside English. Pretrained on 9.83 trillion tokens and fine-tuned for instruction following and safety. Context window stretches to 163,840 tokens.

License: apache-2.0 · Context: 163,840 tokens

Our verdict

Fredoline Eruo | Verified May 28, 2026

9.3/10

If you're building for Spanish, Catalan, Basque, or Galician and need a commercially licensable model, ALIA-40b-instruct-2601 is the most credible open option at this parameter count. The 163K context is a genuine differentiator for long-document work. That said, 40B demands real infrastructure, and the strict inference settings (low temp, no rep penalty) narrow its flexibility. Recommend — but only if your hardware and use case justify the footprint.

Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.25/10. License is explicit Apache 2.0 in the card, commercial use correctly flagged. Metadata (40B, vendor BSC-LT, family llama via HF tags) all check out; context length of 163,840 isn't directly quoted in the excerpt but is consistent with the 'long-context' claim and is widely documented for this release. Editorial voice is appropriately honest — flags the work-in-progress status, the strict inference settings, and the VRAM cost without marketing fluff. Use case is sharp (Iberian-language enterprise document processing), and the verdict gives readers a clear deploy/skip decision. Brand fit is strong: a rare commercially-licensable model for Catalan/Basque/Galician at 40B is exactly the kind of practical, underserved niche runlocalai readers benefit from knowing about.

Flags:
- Context length 163,840 is not explicitly verified in the excerpt and should be confirmed against config.json before publishing
- Family 'llama' is inferred from HF tags; the card doesn't explicitly confirm architecture lineage
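The config.json check mentioned in the flags could be sketched roughly as follows. This is a minimal illustration, not the reviewer's actual process: the sample JSON string is hypothetical (the real file has to be fetched from the model repository and may hold different values), and the key names `max_position_embeddings` and `model_type` are the ones Llama-family configs conventionally use in Hugging Face repositories, assumed here to apply to this model.

```python
import json

# Hypothetical excerpt of a config.json. The real file must be downloaded
# from the model repository and may contain different values.
SAMPLE_CONFIG = '{"model_type": "llama", "max_position_embeddings": 163840}'

CLAIMED_CONTEXT = 163_840  # context length quoted on this page
CLAIMED_FAMILY = "llama"   # family inferred from HF tags


def check_claims(config_text: str) -> dict:
    """Compare the page's claims with values found in a config.json string."""
    config = json.loads(config_text)
    return {
        "context_matches": config.get("max_position_embeddings") == CLAIMED_CONTEXT,
        "family_matches": config.get("model_type") == CLAIMED_FAMILY,
    }


result = check_claims(SAMPLE_CONFIG)
print(result)
```

If either value comes back False against the real file, the corresponding flag should stay open.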

Strengths

Native support for Spanish, Catalan, Basque, and Galician — rare at this scale
163,840-token context window handles long documents comfortably
9.83T pretraining tokens; one of the most data-rich Iberian-language models available
Apache 2.0 — fully commercial, no strings attached

Weaknesses

40B means you need serious VRAM — not a laptop model
Performance outside its five core languages is untested and likely weaker
Vendor explicitly warns: keep temperature at 0–0.2 and disable repetition penalty or output degrades
Model described as a work in progress; behavior may shift in future releases
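The inference constraint in the list above (temperature between 0 and 0.2, repetition penalty disabled) can be encoded as a small guard. This is a sketch under assumptions: the key names `temperature`, `repetition_penalty`, and `do_sample` follow the generation arguments used by Hugging Face Transformers, the helper `safe_sampling_settings` is hypothetical, and a penalty of 1.0 is assumed to mean "disabled", which is the usual convention but worth confirming for your runtime.

```python
# Bounds taken from the vendor warning summarized on this page.
MAX_TEMPERATURE = 0.2
NEUTRAL_REPETITION_PENALTY = 1.0  # assumed to mean "no penalty" in the runtime


def safe_sampling_settings(temperature: float = 0.1) -> dict:
    """Return sampling settings that stay inside the recommended range."""
    if not 0.0 <= temperature <= MAX_TEMPERATURE:
        raise ValueError(
            f"temperature {temperature} is outside 0-{MAX_TEMPERATURE}"
        )
    return {
        "temperature": temperature,
        "repetition_penalty": NEUTRAL_REPETITION_PENALTY,
        # Greedy decoding when temperature is 0, sampling otherwise.
        "do_sample": temperature > 0.0,
    }


settings = safe_sampling_settings(0.1)
print(settings)
```

Rejecting out-of-range values outright, rather than silently clamping them, makes a misconfigured client visible early.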

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization	File size	VRAM required
Q4_K_M	22.0 GB	28 GB
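To make the table's numbers easier to reason about, here is a rough arithmetic sketch using only the figures on this page. It assumes 1 GB means 10^9 bytes and treats the gap between file size and required VRAM as headroom for the KV cache and runtime buffers; both are assumptions, and real memory use grows with context length.

```python
PARAMS_BILLION = 40.0    # parameter count from this page
FILE_SIZE_GB = 22.0      # Q4_K_M file size from the table
VRAM_REQUIRED_GB = 28.0  # VRAM requirement from the table


def implied_bits_per_weight(file_size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter, assuming 1 GB = 10^9 bytes."""
    return file_size_gb * 8 / params_billion


def runtime_headroom_gb(vram_gb: float, file_size_gb: float) -> float:
    """VRAM left over after loading the weights (KV cache, buffers)."""
    return vram_gb - file_size_gb


bpw = implied_bits_per_weight(FILE_SIZE_GB, PARAMS_BILLION)
headroom = runtime_headroom_gb(VRAM_REQUIRED_GB, FILE_SIZE_GB)
print(f"{bpw:.2f} bits per weight, {headroom:.1f} GB headroom")
```

The same two functions can be reused to sanity-check any other quantization you produce yourself.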

Get the model

HuggingFace

Original weights

huggingface.co/BSC-LT/ALIA-40b-instruct-2601

Source repository with the original weights; you must quantize them yourself.

Hardware that runs this

Cards with enough VRAM for at least one quantization of ALIA 40b instruct 2601.

NVIDIA GB200 NVL72

13824GB · nvidia

AMD Instinct MI350X

NVIDIA B300 (Blackwell Ultra)

288GB · nvidia

AMD Instinct MI355X

AMD Instinct MI325X

AMD Instinct MI300X

192GB · amd

NVIDIA H100 NVL

188GB · nvidia
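The "enough VRAM for at least one quantization" filter can be illustrated with a short sketch. The memory figures for the three NVIDIA entries are the ones listed above; the 24 GB entry is hypothetical and included only to show a card that falls below the 28 GB Q4_K_M requirement.

```python
MIN_VRAM_GB = 28  # Q4_K_M requirement from the quantization table

# Memory in GB. The last entry is hypothetical and not taken from this page.
cards = {
    "NVIDIA GB200 NVL72": 13824,
    "NVIDIA B300 (Blackwell Ultra)": 288,
    "NVIDIA H100 NVL": 188,
    "hypothetical 24 GB card": 24,
}


def cards_that_fit(memory_by_card: dict, required_gb: float) -> list:
    """Return the names of cards with at least the required memory, sorted."""
    return sorted(
        name for name, gb in memory_by_card.items() if gb >= required_gb
    )


fits = cards_that_fit(cards, MIN_VRAM_GB)
print(fits)
```

This only checks capacity. It says nothing about throughput or whether a given runtime supports the card.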

Frequently asked

What's the minimum VRAM to run ALIA 40b instruct 2601?

28GB of VRAM is enough to run ALIA 40b instruct 2601 at the Q4_K_M quantization (file size 22.0 GB). Higher-quality quantizations need more.

Can I use ALIA 40b instruct 2601 commercially?

Yes. ALIA 40b instruct 2601 ships under the Apache 2.0 license, which permits commercial use. Always read the license text before deployment.

What's the context length of ALIA 40b instruct 2601?

ALIA 40b instruct 2601 supports a context window of 163,840 tokens (about 164K).
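For long-document work, the practical question is how much of that window is left for the prompt once space for the reply is reserved. A minimal budgeting sketch follows; the helper names are hypothetical, and token counts would have to come from the model's own tokenizer.

```python
CONTEXT_WINDOW = 163_840  # tokens, from this page (160 * 1024)


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the reserved reply length fits in the window."""
    return prompt_tokens + max_new_tokens <= window


def max_prompt_tokens(max_new_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt that still leaves room for max_new_tokens of output."""
    return max(window - max_new_tokens, 0)


budget = max_prompt_tokens(2048)
print(budget, fits_in_context(budget, 2048))
```

Reserving the output budget up front avoids replies that are cut off when a document nearly fills the window.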

Source: huggingface.co/BSC-LT/ALIA-40b-instruct-2601

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
