AMD Radeon RX 6950 XT
Affiliate disclosure: as an Amazon Associate and partner of other retailers, we earn from qualifying purchases. The verdict on this page is our editorial opinion; affiliate links never influence what we recommend.
Extrapolated from 576 GB/s bandwidth — 57.6 tok/s estimated. No measured benchmarks yet.
Plain-English: Comfortable at 14B and below — snappy enough for a coding agent.
Verdicts extrapolated from catalog VRAM + bandwidth + ecosystem flags. Want measured numbers? Submit your own run with runlocalai-bench --submit.
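The page does not state how its estimates are derived from bandwidth. A common rule of thumb, assumed here, is that single-stream decoding is memory-bound: each generated token streams the full set of weights once, so bandwidth divided by weight size gives a ceiling on tok/s. The sketch below applies that rule to the 576 GB/s figure; the function name and the weight sizes are illustrative assumptions, not catalog values.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once.

    Assumption: decoding is purely memory-bandwidth-bound. Real runs land
    below this ceiling because of KV-cache reads and kernel overhead.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


BANDWIDTH_GB_S = 576.0  # RX 6950 XT memory bandwidth, as quoted on this page

# Hypothetical weight footprints in GB (illustrative, not measured).
for weights_gb in (4.0, 8.0, 10.0):
    ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, weights_gb)
    print(f"{weights_gb:.0f} GB of weights: {ceiling:.1f} tok/s ceiling")
```

Under this assumption, a 57.6 tok/s estimate corresponds to roughly 10 GB of weights read per token. Measured throughput should be expected to fall below the ceiling.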
The RX 6950 XT is for the operator who wants 16 GB of VRAM for models up to roughly 14B and doesn't need CUDA-exclusive software or multi-GPU scaling. Extrapolated from its 576 GB/s bandwidth, it should run 7B Q4 at ~110-145 tok/s and 13B Q4 at ~60-80 tok/s; these are estimates, not measured benchmarks. 30B Q4 is borderline: at roughly 4.5 bits per weight the weights alone come to about 17 GB, so the ~25-35 tok/s estimate only holds if a smaller quant keeps the whole model in VRAM.
What breaks: ROCm support is narrower than CUDA; some frameworks (e.g., vLLM, ExLlama) lack full ROCm optimization, and flash attention may be missing. Multi-GPU setups are harder to configure.
When to pass: if the workload requires CUDA-only tools (e.g., TensorRT-LLM), or if 24 GB VRAM is needed for 34B+ models. At $520-620 used, this is a capable 16 GB card, but it does not out-spec a used RTX 3090, which has more VRAM (24 GB), more bandwidth (936 GB/s), and a broader software ecosystem.
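Whether a given model fits in 16 GB can be sanity-checked with simple arithmetic. The sketch below assumes about 4.5 bits per weight for a Q4 quant (the figure for llama.cpp's Q4_0 layout) and a hypothetical 1.5 GB allowance for KV cache and runtime buffers; both numbers are assumptions and shift with the quant type and context length.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8.0


def fits_in_vram(params_billion: float,
                 bits_per_weight: float = 4.5,   # assumed Q4-class quant
                 vram_gb: float = 16.0,          # RX 6950 XT VRAM
                 overhead_gb: float = 1.5) -> bool:  # hypothetical allowance
    """True if weights plus the assumed overhead stay within VRAM."""
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


for size in (7, 13, 14, 30):
    need = weights_gb(size, 4.5)
    verdict = "fits" if fits_in_vram(size) else "does not fit"
    print(f"{size}B at 4.5 bpw: ~{need:.1f} GB of weights, {verdict} in 16 GB")
```

Under these assumptions, 7B through 14B fit with room for context, while 30B comes out slightly above 16 GB before any KV cache is counted, so the 30B case depends on the quant chosen.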
Why this rating
Strong inference speed and 16 GB VRAM for roughly $520-620 used make it a solid choice for local LLMs, but ROCm's narrower software support and lack of CUDA compatibility hold it back from a higher rating. It's a niche pick for operators who prioritize raw bandwidth and can work around ROCm limitations.
Overview
Refreshed 6900 XT with faster GDDR6 (576 GB/s). 16 GB VRAM, slightly more compute. ROCm officially supported. ~110-145 tok/s on 7B Q4 (estimated). The bandwidth bump over the 6900 XT matters for inference; at $520-620 used it is a 16 GB alternative to a used RTX 3090, which has more VRAM (24 GB) and more bandwidth (936 GB/s).
Specs
| VRAM | 16 GB |
| Power draw | 335 W |
| Released | 2022 |
| MSRP | $1099 |
| Backends | ROCm, Vulkan |
Models that fit
Open-weight models small enough to run on AMD Radeon RX 6950 XT with usable context.
Frequently asked
What models can AMD Radeon RX 6950 XT run?
Does AMD Radeon RX 6950 XT support CUDA?
How much does AMD Radeon RX 6950 XT cost?
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.