141B parameters · Commercial OK
WizardLM-2 8x22B
Microsoft's RLHF-heavy fine-tune of Mixtral 8x22B. Briefly the top open chat model on LMSYS at release.
License: Apache 2.0 · Released Apr 15, 2024 · Context: 65,536 tokens
Overview
WizardLM-2 8x22B is Microsoft's RLHF-heavy fine-tune of Mistral's Mixtral 8x22B mixture-of-experts model (141B total parameters). At its April 2024 release it was briefly the top-ranked open chat model on the LMSYS leaderboard.
Strengths
- Strong chat quality: briefly the top open chat model on LMSYS at release
- Apache 2.0 license: commercial use permitted
Weaknesses
- Workstation-only: needs 96 GB+ of VRAM, beyond any single consumer GPU
- Older: released April 2024, and newer open models have since overtaken it on chat leaderboards
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 84.0 GB | 96 GB |
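The listed file size follows directly from the parameter count. A back-of-the-envelope sketch, assuming Q4_K_M averages roughly 4.8 bits per weight (an approximation; the "K" schemes mix 4- and 6-bit blocks):

```python
def gguf_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in decimal GB: parameters x bits / 8 bytes."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# 141B parameters at ~4.8 bits/weight lands near the 84.0 GB listed above.
q4_km = gguf_size_gb(141, 4.8)  # ~84.6 GB
```

The same arithmetic lets you sanity-check any quantization before downloading it: multiply the parameter count by the bits per weight and divide by 8.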
Get the model
HuggingFace
Original weights
huggingface.co/microsoft/WizardLM-2-8x22B
Source repository with the original fp16 weights; no prequantized files are listed here, so you must convert and quantize them yourself for local use.
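Because only the original weights are published, local use means running the conversion yourself. A dry-run sketch of the usual llama.cpp flow, echoing each step instead of executing it (script names like `convert_hf_to_gguf.py` and `llama-quantize` follow current llama.cpp conventions but may differ in your checkout):

```shell
# Dry-run sketch of the convert-and-quantize flow; echoes commands only.
REPO=microsoft/WizardLM-2-8x22B
QUANT=Q4_K_M

# 1. Fetch the original fp16 safetensors (~282 GB: 141B params x 2 bytes):
echo "huggingface-cli download $REPO --local-dir wizardlm-2-8x22b"

# 2. Convert the checkpoint to a single fp16 GGUF file:
echo "python convert_hf_to_gguf.py wizardlm-2-8x22b --outfile wizardlm-f16.gguf"

# 3. Quantize down to Q4_K_M (~84 GB, per the table above):
echo "./llama-quantize wizardlm-f16.gguf wizardlm-$QUANT.gguf $QUANT"
```

Budget disk space for the fp16 intermediate: you briefly need the original weights, the fp16 GGUF, and the quantized output on disk at once.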
Hardware that runs this
Cards with enough VRAM for at least one quantization of WizardLM-2 8x22B.
Compare alternatives
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Same tier
Models in the same parameter band as this one
Step up
More capable — bigger memory footprint
No models with a verdict in the next tier up yet.
Step down
Smaller — faster, runs on weaker hardware
Frequently asked
What's the minimum VRAM to run WizardLM-2 8x22B?
96 GB of VRAM is enough to run WizardLM-2 8x22B at the Q4_K_M quantization (file size 84.0 GB); higher-quality quantizations need more.
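The 96 GB figure covers the weights plus runtime overhead at modest context lengths; the KV cache grows linearly with context. A rough sketch of that cost, assuming Mixtral 8x22B's attention shape (56 layers, 8 KV heads, head dimension 128; these figures are assumptions taken from the base model's published config, not from this page):

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                tokens: int, bytes_per_elem: int = 2) -> float:
    """KV-cache size in decimal GB: K and V caches together are
    2 x layers x kv_heads x head_dim x tokens x bytes_per_elem."""
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem / 1e9

# Filling the full 65,536-token window at fp16 adds roughly 15 GB
# on top of the 84 GB of Q4_K_M weights.
full_ctx = kv_cache_gb(56, 8, 128, 65536)
```

So on a 96 GB card the full context window is tight with an fp16 cache; shorter contexts or a quantized KV cache leave more headroom.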
Can I use WizardLM-2 8x22B commercially?
Yes: WizardLM-2 8x22B ships under the Apache 2.0 license, which permits commercial use. Always read the license text before deployment.
What's the context length of WizardLM-2 8x22B?
WizardLM-2 8x22B supports a context window of 65,536 tokens (64K).
Source: huggingface.co/microsoft/WizardLM-2-8x22B
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.