8B parameters
Restricted
Reviewed June 2026

Aya 23 8B

Cohere's multilingual research model covering 23 languages. CC-BY-NC — research only.

License: CC-BY-NC 4.0 · Released May 23, 2024 · Context: 8,192 tokens

Our verdict

OP · Fredoline Eruo|VERIFIED JUN 12, 2026
unrated

Positioning

Aya 23 8B is a dense 8-billion-parameter multilingual model released by Cohere For AI under the CC-BY-NC 4.0 license, which restricts use to research and non-commercial applications. With a context window of 8,192 tokens, it covers 23 languages, making it a specialized tool for multilingual NLP research. Its relatively small size and dense architecture place it firmly in the consumer deployment class, accessible to operators with modest hardware.

Strengths

  • Multilingual coverage: Supports 23 languages, making it a rare open-weight option for cross-lingual research without requiring multiple models.
  • Consumer-friendly size: At 8B parameters, quantized versions fit comfortably on single consumer GPUs (e.g., Q4_K_M at about 5.0 GB on disk).
  • Dense architecture simplicity: Unlike mixture-of-experts models, dense models have predictable memory and compute requirements, simplifying deployment.
  • Research-focused licensing: CC-BY-NC encourages academic exploration and reproducibility, though commercial use is prohibited.

Limitations

  • Non-commercial license only: CC-BY-NC prohibits commercial deployment, limiting its use in production or revenue-generating applications.
  • Modest context length: 8,192 tokens may be insufficient for long-document tasks or extended conversations.
  • No community benchmarks available: We do not have verified operator measurements for this model; published vendor metrics should be treated as best-case.
  • Single-language performance unknown: While multilingual, its per-language quality may vary; operators should evaluate for their specific target languages.

What it takes to run this locally

At 8B parameters, model file sizes range from about 16 GB (unquantized FP16) down to ~2.6 GB (Q2_K). For practical use, a Q4_K_M (~5.0 GB) or Q5_K_M (~5.7 GB) quant offers a good balance of quality and memory footprint. Add ~30-50% on top of the file size for KV cache and framework overhead at typical context lengths. Either quant then fits on a single consumer GPU with 8-12 GB of VRAM (e.g., RTX 3060 or 4060), though Q5_K_M leaves little headroom on an 8 GB card. We have no verified tokens-per-second measurements for this model.
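The rule of thumb above (file size plus roughly 30-50% for KV cache and framework overhead) can be sketched as a small calculation. This is a minimal sketch, not a measurement: the function names are hypothetical, and the overhead range is simply the one quoted in this section.

```python
def estimate_vram_gb(file_size_gb, overhead_low=0.30, overhead_high=0.50):
    """Return a (low, high) VRAM estimate in GB for a model file.

    Assumption: runtime memory is the file size plus a flat 30-50%
    allowance for KV cache and framework overhead, as quoted above.
    """
    if file_size_gb <= 0:
        raise ValueError("file_size_gb must be positive")
    low = file_size_gb * (1 + overhead_low)
    high = file_size_gb * (1 + overhead_high)
    return low, high


def fits_in_vram(file_size_gb, vram_gb):
    """Conservative check: require the high-end estimate to fit."""
    _, high = estimate_vram_gb(file_size_gb)
    return high <= vram_gb


if __name__ == "__main__":
    # Q5_K_M at ~5.7 GB, checked against 8 GB and 12 GB cards.
    low, high = estimate_vram_gb(5.7)
    print(f"Q5_K_M estimate: {low:.2f} to {high:.2f} GB")
    print("fits in 8 GB (conservative):", fits_in_vram(5.7, 8.0))
    print("fits in 12 GB (conservative):", fits_in_vram(5.7, 12.0))
```

The check uses the high end of the range on purpose, so a quant that only fits under the optimistic 30% figure is reported as not fitting. Actual usage depends on the runtime, context length, and KV cache settings.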

Should you run this locally?

Yes if you are conducting multilingual NLP research and need a model whose license permits research use and that runs on consumer hardware. Its small size and dense architecture make it easy to experiment with on a single GPU.

No if you need commercial deployment rights, require longer context windows, or need state-of-the-art performance in a single language. The CC-BY-NC license and 8K context limit may be dealbreakers for production use.

Catalog cross-links

Family & lineage

How this model relates to others in its lineage. Family members share architecture and training-data roots; parent / children edges record direct distillation or fine-tune relationships.

Strengths

  • 23 languages
  • Research-friendly

Weaknesses

  • Non-commercial license

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization    File size    VRAM required
Q4_K_M          5.0 GB       8 GB
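To make the size-versus-VRAM trade-off concrete, here is a hedged sketch that picks the largest variant fitting a given VRAM budget. The Q4_K_M size is the table's 5.0 GB; the other sizes are the approximate figures quoted in the local-running section. The 40% overhead is an assumed midpoint of that section's 30-50% range, and the function name is hypothetical.

```python
# Approximate file sizes in GB as quoted on this page (not measured here).
QUANT_FILE_SIZES_GB = {
    "Q2_K": 2.6,
    "Q4_K_M": 5.0,
    "Q5_K_M": 5.7,
    "FP16": 16.0,
}

# Assumption: midpoint of the 30-50% KV cache / framework allowance.
OVERHEAD_FRACTION = 0.40


def largest_quant_that_fits(vram_gb, sizes=QUANT_FILE_SIZES_GB,
                            overhead=OVERHEAD_FRACTION):
    """Return the name of the largest variant whose estimated footprint
    (file size plus overhead) fits in vram_gb, or None if none fits."""
    fitting = [
        (size, name)
        for name, size in sizes.items()
        if size * (1 + overhead) <= vram_gb
    ]
    if not fitting:
        return None
    # Larger files generally mean less quantization loss.
    return max(fitting)[1]


if __name__ == "__main__":
    for budget in (3, 6, 8, 12, 24):
        print(f"{budget} GB VRAM ->", largest_quant_that_fits(budget))
```

Treat the result as a starting point only: a variant that fits on paper can still run out of memory at long context lengths.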

Get the model

HuggingFace

Original weights

huggingface.co/CohereForAI/aya-23-8B

Source repository — direct quantization required.
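One way to fetch the original weights is the `huggingface_hub` Python package. The sketch below assumes that package is installed and that your account can access the repository (the model page may ask you to accept the license terms and authenticate first). The helper name and the local directory are hypothetical. It downloads the unquantized weights only; producing a quantized file is a separate step with your tool of choice.

```python
REPO_ID = "CohereForAI/aya-23-8B"  # repository listed above


def download_original_weights(local_dir="aya-23-8B"):
    """Download the full repository snapshot and return its local path.

    Assumptions: `huggingface_hub` is installed, and access to the
    repository has already been granted to the logged-in account.
    """
    # Imported lazily so this file can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Usage (downloads roughly 16 GB of FP16 weights, per the sizes above):
#     path = download_original_weights()
#     print("weights saved to", path)
```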

Frequently asked

What's the minimum VRAM to run Aya 23 8B?

8 GB of VRAM is enough to run Aya 23 8B at the Q4_K_M quantization (file size 5.0 GB). Higher-quality quantizations need more.

Can I use Aya 23 8B commercially?

Aya 23 8B is released under the CC-BY-NC 4.0 license, which prohibits commercial use. Research and other non-commercial use is allowed with attribution. Review the license terms before building anything around it.

What's the context length of Aya 23 8B?

Aya 23 8B supports a context window of 8,192 tokens (about 8K).
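Because the prompt and the generated text share the same window, the room left for generation is what remains after the prompt. A minimal sketch of that arithmetic, assuming you already have a token count from the model's tokenizer (the function name is hypothetical):

```python
CONTEXT_WINDOW = 8192  # tokens, as stated above


def max_new_tokens(prompt_tokens, context_window=CONTEXT_WINDOW):
    """Return how many tokens can still be generated after the prompt.

    Returns 0 when the prompt alone fills or exceeds the window; in
    that case the prompt has to be shortened or split.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    for n in (1_000, 6_000, 9_000):
        print(f"prompt of {n} tokens -> room for {max_new_tokens(n)} more")
```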

Source: huggingface.co/CohereForAI/aya-23-8B

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
