Dolphin 3.0 Mistral 24B
Eric Hartford's Dolphin fine-tune of Mistral Small 3 — uncensored, function-calling, agent-friendly.
The uncensored counterpart to Mistral Small 3 24B: same hardware footprint, same Apache base license, refusal layer minimized. It's the right pick for research, red-team, or technical-writing workflows where Mistral's base alignment gets in the way.
Strengths
- Apache base license carries through — uncensored and license-clean.
- 24B body is meaningfully more capable than Hermes 3's 8B base.
- Strong technical/research output — handles dual-use prompts that base models refuse.

Weaknesses
- Niche use case — most users are better served by base Mistral Small 3 24B.
- Creative writing slightly worse than base — the alignment training Dolphin strips out also adds polish.
- Refusals minimized but not zero — output still requires user judgment.
Performance
- Q4_K_M (14.6 GB): 75–92 tok/s decode — full GPU
- Q5_K_M (17.3 GB): 62–78 tok/s
- Q8_0 (26 GB): partial offload only
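At these decode rates, response latency is simple to estimate: divide the expected token count by the decode speed. A quick sketch using the Q4_K_M range above (the 500-token answer length is just an illustrative choice):

```python
def decode_time_s(tokens: int, tok_per_s: float) -> float:
    """Seconds to stream `tokens` at a given decode rate."""
    return tokens / tok_per_s

# A ~500-token answer at Q4_K_M speeds (75-92 tok/s):
best = decode_time_s(500, 92)   # fastest end of the range
worst = decode_time_s(500, 75)  # slowest end of the range
print(f"{best:.1f}-{worst:.1f} s")  # → 5.4-6.7 s
```

Note this covers decode only; prompt-processing (prefill) time on a long context adds to the total.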
Yes, for technical research and security workflows where base Mistral refusals block legitimate work. No, for general chat — base Mistral Small 3 24B is the right default.
How it compares
- vs Mistral Small 3 24B (base) → Dolphin minus the alignment layer. Pick base for general use, Dolphin for research/technical work.
- vs Hermes 3 Llama 3.1 8B → Dolphin is bigger and Apache-licensed; Hermes is smaller and on Llama base.
- vs Hermes 3 Llama 3.1 70B → 70B Hermes is smarter; Dolphin is more accessible (full GPU on 24 GB) and Apache-licensed.
```shell
ollama pull dolphin3:24b-mistral-q4_K_M
ollama run dolphin3:24b-mistral-q4_K_M
```
Settings: Q4_K_M GGUF, 16384 ctx, full GPU on RTX 4090
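Once the model is running, Ollama exposes it over a local HTTP API. A minimal sketch of a request body for the `/api/chat` endpoint, using the model tag from the pull command and the 16384-token context from the settings above (the prompt content is illustrative):

```python
import json

# Request body for Ollama's local chat endpoint
# (POST http://localhost:11434/api/chat).
payload = {
    "model": "dolphin3:24b-mistral-q4_K_M",
    "messages": [
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain GGUF quantization in two sentences."},
    ],
    "stream": False,                  # return one complete response
    "options": {"num_ctx": 16384},    # matches the 16384-ctx setting above
}
print(json.dumps(payload, indent=2))
```

Send it with any HTTP client; the response JSON carries the reply under `message.content`.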
Why this rating
7.5/10 — Eric Hartford's uncensored, alignment-stripped fine-tune of Mistral Small 3 24B. Same caveats as Hermes 3 but on a stronger 24B base. Loses points to base Mistral Small 3 24B for general use.
Strengths
- Uncensored
- Function calling
- Apache 2.0
Weaknesses
- Uncensored = use carefully in production
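The function-calling strength above is exercised by passing tool schemas with the chat request; Ollama's `/api/chat` accepts OpenAI-style function definitions under a `tools` key. A hedged sketch of one such schema (the tool name and parameters are hypothetical):

```python
# OpenAI-style tool schema; a list of these goes in the "tools" field
# of an Ollama /api/chat request. Name and fields are hypothetical.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}
```

When the model decides to call the tool, the response carries a `tool_calls` entry with the chosen function name and arguments for your code to execute.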
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 14.0 GB | 18 GB |
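The VRAM column is roughly weights plus KV cache plus runtime overhead. A back-of-envelope sketch — the per-context KV figure and the overhead constant are assumptions for illustration, not measurements:

```python
def vram_estimate_gb(file_gb: float, ctx: int,
                     kv_gb_per_4k: float = 0.75,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM need: weights + KV cache (scales with context) + overhead.
    kv_gb_per_4k and overhead_gb are assumed values, not measurements."""
    return file_gb + (ctx / 4096) * kv_gb_per_4k + overhead_gb

# Q4_K_M weights at the 16384-token test context:
print(round(vram_estimate_gb(14.0, 16384), 1))  # → 18.0
```

With these assumed constants the estimate lands on the table's 18 GB figure; shrinking the context window is the easiest way to trim the budget.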
Get the model
Ollama
One-line install
```shell
ollama run dolphin-mistral:24b
```
Read our Ollama review →

HuggingFace
Original weights
Source repository with the original weights; you'll need to quantize them yourself.
Hardware that runs this
Cards with enough VRAM for at least one quantization of Dolphin 3.0 Mistral 24B.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run Dolphin 3.0 Mistral 24B?
18 GB, which covers the Q4_K_M quantization (14.0 GB file) with headroom for context.

Can I use Dolphin 3.0 Mistral 24B commercially?
Yes. The Apache 2.0 license of the Mistral Small 3 base carries through to this fine-tune.

What's the context length of Dolphin 3.0 Mistral 24B?
It inherits the 32K context window of Mistral Small 3; our test settings use 16384 tokens.

How do I install Dolphin 3.0 Mistral 24B with Ollama?
Run `ollama pull dolphin3:24b-mistral-q4_K_M`, then `ollama run dolphin3:24b-mistral-q4_K_M`.
Source: huggingface.co/cognitivecomputations/Dolphin3.0-Mistral-24B
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.