Salamandra 2B

Salamandra 2B is a base (pretrained-only) transformer trained from scratch by Barcelona Supercomputing Center on 12.875 trillion tokens across 35 European languages and code. It has 2.25B parameters and an 8,192-token context window, and it is one of the few models at this size with native coverage of Spanish and the co-official Iberian languages (such as Catalan, Galician, and Basque) built in from pretraining. It is not instruction-tuned, so you will need to fine-tune it before it is useful in a product.

License: Apache 2.0 · Context: 8,192 tokens

Our verdict

OP · Eruo Fredoline | Verified May 29, 2026

9.4/10

If you need a clean, commercially licensed base model with real depth in Spanish and the other languages of Spain to fine-tune on your own data, Salamandra 2B is a reasonable starting point. BSC-LT built it from scratch rather than adapting an English-first model, which matters for Iberian language quality at the token level. That said, this is strictly a base model: do not deploy it raw. If you want something you can run today without fine-tuning work, skip this and look at an instruction-tuned alternative.

Why this rating

Auto-generated rating (Opus 4.7 judge, claude-opus-4-7). Overall 9.40/10. The license (Apache 2.0) is explicitly stated in the model card and correctly marked commercial-friendly. The metadata checks out: 2,253,490,176 parameters rounds to 2.25B, the context is 8,192 tokens, the family is llama (matching the architecture), and the vendor is BSC-LT. The editorial voice is honest and operator-grade: it explicitly flags the base-only status and warns against raw deployment. The use case is appropriately sharp (Iberian language fine-tuning). Brand fit is solid for the European/multilingual local-AI niche, slightly narrower than mainstream English models but legitimate. The weaknesses honestly note the adoption gap.
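
As a quick sanity check on the parameter figure quoted above, the sketch below rounds the raw count to billions and estimates how much memory the weights alone would occupy in a few common formats. The bytes-per-parameter values are standard widths for those formats, not figures from the model card, and the estimate deliberately ignores activations, KV cache, optimizer state, and quantization overhead, so treat the results as a lower bound.

```python
# Parameter count as quoted in the rating notes.
PARAM_COUNT = 2_253_490_176

# Assumption: nominal bytes per parameter for common weight formats.
# Real 4-bit formats carry extra scale/zero-point data, so they run larger.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16/fp16": 2.0, "int8": 1.0, "4-bit": 0.5}


def rounded_billions(n_params: int, digits: int = 2) -> float:
    """Express a raw parameter count in billions, rounded to `digits` places."""
    return round(n_params / 1e9, digits)


def weight_memory_gib(n_params: int, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in GiB (2**30 bytes)."""
    return n_params * bytes_per_param / 2**30


if __name__ == "__main__":
    print(f"{rounded_billions(PARAM_COUNT)}B parameters")
    for fmt, width in BYTES_PER_PARAM.items():
        print(f"{fmt:>9}: {weight_memory_gib(PARAM_COUNT, width):.2f} GiB")
```

In half precision the weights come to a little over 4 GiB, which is the practical reason a model of this size is attractive for local fine-tuning experiments.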

Overview

Strengths

Strong Spanish and European language coverage built in from pretraining, not retrofitted
12.875T-token training corpus, large for a 2B-class model
8192-token context window is generous at this parameter count
Apache 2.0 license, fully commercial-friendly
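
To put "large for a 2B-class model" in numbers, the sketch below divides the stated token count by the exact parameter count from the model metadata, which works out to roughly 5,700 training tokens per parameter. The 20-tokens-per-parameter reference value is an assumption: it is a commonly cited rule of thumb for compute-optimal training, not a figure from BSC or from this review.

```python
TRAINING_TOKENS = 12.875e12   # 12.875 trillion tokens, as stated in this review
PARAM_COUNT = 2_253_490_176   # exact parameter count from the model metadata

# Assumption: a widely quoted rule-of-thumb ratio for compute-optimal
# training. Used here only as a yardstick; it is not from the model card.
RULE_OF_THUMB_TOKENS_PER_PARAM = 20.0


def tokens_per_parameter(tokens: float, params: int) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


if __name__ == "__main__":
    ratio = tokens_per_parameter(TRAINING_TOKENS, PARAM_COUNT)
    print(f"tokens per parameter: {ratio:,.0f}")
    print(f"multiple of rule of thumb: {ratio / RULE_OF_THUMB_TOKENS_PER_PARAM:,.0f}x")
```

A ratio this far above the yardstick suggests the model was trained for inference-time quality at a small size rather than for training-compute efficiency, which fits its positioning as a compact base model.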

Weaknesses

Base model only: requires fine-tuning before any instruction-following use
At 2.25B parameters, it will underperform larger models on complex reasoning or generation tasks
No coverage of languages outside Europe
Low adoption so far (2,120 downloads, 25 likes), so community troubleshooting resources are limited
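
Because fine-tuning is the entry cost for this model, one early data-prep step is worth sketching: packing already-tokenized documents into fixed-length blocks that fit the 8,192-token context window. This is a minimal illustration of a common packing approach, not BSC's training pipeline. The `pack_into_blocks` helper and the `eos_id` parameter are hypothetical names; in practice the separator id would come from the model's own tokenizer.

```python
from typing import Iterable, Iterator, List

CONTEXT_LENGTH = 8192  # context window stated for Salamandra 2B


def pack_into_blocks(
    tokenized_docs: Iterable[List[int]],
    eos_id: int,
    block_size: int = CONTEXT_LENGTH,
) -> Iterator[List[int]]:
    """Concatenate tokenized documents, separated by `eos_id`, and yield
    blocks of exactly `block_size` token ids.

    `eos_id` is a placeholder: take the real end-of-sequence id from the
    tokenizer you fine-tune with. A trailing partial block is dropped.
    """
    buffer: List[int] = []
    for doc in tokenized_docs:
        buffer.extend(doc)
        buffer.append(eos_id)  # mark the document boundary
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]


if __name__ == "__main__":
    # Toy token ids and a tiny block size, just to show the behavior.
    toy_docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
    for block in pack_into_blocks(toy_docs, eos_id=0, block_size=4):
        print(block)
```

Packing avoids spending most of each training sequence on padding when documents are short. The trade-off is that one block can span several documents, so whether to mask attention across those boundaries is a separate decision for your training setup.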