AMD Instinct MI300A (APU)
Combined CPU + GPU APU with 128GB unified HBM3. Powers the El Capitan supercomputer.
Sub-scores sum to 886 / 1000. Headline = 886 × 0.70 (Estimated-confidence discount) = 620.2, shown as 620. This is an algorithmic performance-tier score. It is distinct from, and often lower than, the editorial “Our verdict” below, which weighs value and real-world fit (especially for hardware we haven’t measured yet).
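The headline arithmetic above can be reproduced in a few lines. This is a minimal sketch, not the site's actual scoring code: the function name `headline_score` and the round-to-nearest step are assumptions, and only the 886 sub-score total and the 0.70 Estimated-confidence discount come from the text.

```python
def headline_score(sub_score_total: int, confidence_discount: float) -> int:
    """Apply a confidence discount to the summed sub-scores (hypothetical helper)."""
    if not 0.0 <= confidence_discount <= 1.0:
        raise ValueError("confidence_discount must be between 0 and 1")
    # Rounding to the nearest integer is an assumption: 886 * 0.70 = 620.2.
    return round(sub_score_total * confidence_discount)


print(headline_score(886, 0.70))
```

A discount of 1.0 would leave the sub-score total unchanged, which is presumably what a measured (non-estimated) entry would receive.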
Estimated 530 tok/s, extrapolated from the 5,300 GB/s (5.3 TB/s) memory bandwidth. No measured benchmarks yet.
Plain-English: Runs 70B comfortably — snappy enough for a coding agent.
Verdicts extrapolated from catalog VRAM + bandwidth + ecosystem flags. Want measured numbers? Submit your own run with runlocalai-bench --submit.
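A bandwidth-based estimate of this kind is commonly computed as memory bandwidth divided by the bytes read per generated token. The sketch below assumes that memory-bound model; the catalog's actual formula is not stated on this page. Under that assumption, 530 tok/s at 5,300 GB/s corresponds to roughly 10 GB of weights streamed per token. The 70B-at-4-bit input (0.5 bytes per parameter) is an illustrative assumption, not a measurement, and both function names are hypothetical.

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in GB (billions of params x bytes each)."""
    return params_billion * bytes_per_param


def est_tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper-bound decode rate if each token streams all weights once."""
    if gb_read_per_token <= 0:
        raise ValueError("gb_read_per_token must be positive")
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 5300.0  # from the page: 5.3 TB/s HBM3

# ~10 GB per token reproduces the page's 530 tok/s figure.
print(est_tokens_per_second(BANDWIDTH_GB_S, 10.0))
# A 70B model at 4-bit (~35 GB of weights) under the same assumption.
print(round(est_tokens_per_second(BANDWIDTH_GB_S, weights_gb(70, 0.5)), 1))
```

Real throughput will be lower than this bound, since it ignores KV-cache reads, kernel efficiency, and compute limits.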
What it does well
The MI300A is AMD's CPU+GPU APU: 24 Zen 4 cores, 228 CDNA 3 compute units, and 128 GB of unified HBM3 at 5.3 TB/s, all on a single package. The architecture removes the CPU↔GPU copy overhead that bottlenecks traditional discrete-GPU systems, because the Zen 4 cores and the CDNA 3 GPU share one physical HBM3 pool with fully coherent access. This is the chip in the El Capitan supercomputer at LLNL, an exascale system that took the top spot on the November 2024 TOP500 list. For LLM workloads the unified memory is genuinely useful: with 128 GB on-package per APU and Infinity Fabric linking sockets, an 8× MI300A configuration totals 1,024 GB (1 TB) of HBM3, a meaningful advantage over traditional CPU↔GPU systems for memory-bound inference and training. ROCm 6+ supports MI300A as a first-class target. Cap-ex is OEM/integrator-only, typically $20,000-$30,000 per APU socket.
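To make the capacity figures concrete, here is a rough fit check for inference. It is a sketch under stated assumptions: the 90% usable-memory headroom, the optional KV-cache allowance, and the function names are all hypothetical, and it treats the combined HBM of several APUs as one pool for sizing purposes only.

```python
HBM_PER_APU_GB = 128  # from the page


def combined_hbm_gb(num_apus: int) -> int:
    """Total HBM3 across a group of MI300A sockets."""
    return HBM_PER_APU_GB * num_apus


def fits(params_billion: float, bytes_per_param: float, num_apus: int,
         kv_cache_gb: float = 0.0, headroom: float = 0.9) -> bool:
    """True if weights plus KV cache fit within the assumed usable fraction."""
    needed_gb = params_billion * bytes_per_param + kv_cache_gb
    return needed_gb <= combined_hbm_gb(num_apus) * headroom


print(combined_hbm_gb(8))   # 8 x 128 GB
print(fits(70, 0.5, 1))     # 70B at 4-bit on one APU
print(fits(70, 2.0, 1))     # 70B at FP16 on one APU
print(fits(70, 2.0, 2))     # 70B at FP16 across two APUs
```

Under these assumptions a 4-bit 70B model fits on a single APU, while the FP16 version (about 140 GB of weights) needs at least two.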
Where it breaks
- OEM/integrator-only procurement. MI300A doesn't ship to typical enterprises — you buy it as part of an HPE Cray supercomputer or HPE ProLiant XD685+ APU server. Lead times measured in months, MOQs measured in racks.
- No CUDA — full stop. AMD ROCm ecosystem only. Same long-tail framework compatibility constraints as MI300X and other Instinct cards.
- Architecture is APU, not pure-GPU. The 24 Zen 4 cores are useful for orchestration but the GPU compute density is lower than MI300X (228 CUs vs 304 CUs) due to silicon die budget shared with the CPU. For pure GPU workloads, MI300X wins.
- Software stack tuned for HPC, not pure LLM. El Capitan's workload mix is HPC scientific simulation, weather modeling, etc. — LLM-specific optimization on MI300A is less mature than MI300X.
- Resale and used-market liquidity is essentially zero. Decommissioned El Capitan racks may eventually surface, but transaction volume will be tiny.
- Power and cooling infrastructure is HPC-tier. Peak draw is 760 W per APU socket, and sustained workloads typically call for liquid cooling.
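For the "No CUDA" point in the list above, a quick way to see which GPU stack a Python environment targets is to inspect the PyTorch build. This is a minimal sketch, assuming PyTorch is the framework in use; `detect_gpu_backend` is a hypothetical helper name. It reports how the installed PyTorch was built, not whether an MI300A is actually present.

```python
def detect_gpu_backend() -> str:
    """Report which GPU stack the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "no-torch"
    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    if getattr(torch.version, "hip", None):
        return "rocm"
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu"


print(detect_gpu_backend())
```

On ROCm builds PyTorch still exposes devices through the `torch.cuda` namespace, so code written against that API often runs unchanged. Custom CUDA kernels and CUDA-only libraries are where the compatibility gaps tend to appear.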
Ideal model range
- Sweet spot: HPC + LLM hybrid workloads where CPU↔GPU coherence advantage genuinely matters (specific scientific computing + AI fusion workflows).
- Sweet spot: Multi-tenant production inference at supercomputer scale where 8× APU node = 1 TB combined HBM is genuinely useful.
- Sweet spot: Trillion-parameter foundation model training where unified memory architecture reduces transfer overhead vs traditional discrete GPU.
- Sweet spot: National lab / sovereign AI deployments where MI300A's specific El Capitan provenance is the procurement vehicle.
- Bad fit: Pure LLM production inference (MI300X is better), single-card workloads (wrong tier), enterprise procurement (wrong channel).
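The scale implied by the trillion-parameter and 1 TB-per-node bullets above can be sanity-checked with a back-of-envelope calculation. This is a sketch under assumptions: it uses the common rule of thumb of about 16 bytes per parameter for mixed-precision Adam training state, ignores activations and communication buffers, and `min_nodes` is a hypothetical helper.

```python
import math

HBM_PER_APU_GB = 128   # from the page
APUS_PER_NODE = 8      # the page's 8x configuration (1,024 GB per node)


def min_nodes(params_billion: float, bytes_per_param: float) -> int:
    """Smallest node count whose combined HBM holds the given state."""
    needed_gb = params_billion * bytes_per_param
    node_gb = HBM_PER_APU_GB * APUS_PER_NODE
    return math.ceil(needed_gb / node_gb)


print(min_nodes(1000, 2))    # 1T params, FP16 weights only
print(min_nodes(1000, 16))   # 1T params, mixed-precision Adam state
```

Even before activations, a trillion-parameter training run needs many such nodes, which is why this tier is framed as supercomputer-scale procurement.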
Bad use cases
- Standard enterprise procurement. Pick MI300X or NVIDIA equivalents.
- Pure LLM serving. MI300X has more GPU CUs at similar memory tier.
- CUDA-locked stacks. Don't pick AMD if your toolchain requires CUDA.
- Buying decisions in general. For most readers this page is reference information on the AMD APU architecture that powers El Capitan, not a purchase guide.
- Cost-conscious anything. Wrong tier entirely.
- Workstation deployment. Rack/HPC-only.
Verdict
Buy this if you're spec'ing HPC infrastructure (national lab, defense, large pharma) where MI300A's specific HPC + AI fusion capability matters, you have OEM relationships with HPE Cray for supercomputer-scale procurement, and your workload genuinely benefits from CPU+GPU coherent unified memory at the rack scale. MI300A is the right pick for the narrow HPC + LLM hybrid use case.
Skip this if you're a typical enterprise (pick MI300X or MI325X for AMD; H200 or B200 for NVIDIA), you're pure-LLM serving (MI300X has more GPU compute), CUDA-locked, or you can't budget OEM/integrator-only procurement. For most readers, this verdict is informational reference, not a buying decision.
How it compares
- vs MI300X (192 GB) → MI300X has 50% more memory (192 GB vs 128 GB), 304 CUs (33% more), and standard OAM-server procurement at about $20k cap-ex. MI300A offers CPU+GPU coherent unified memory and APU integration at $25-30k via OEMs. Pick MI300X for typical enterprise use; pick MI300A for HPC + AI fusion use cases. See /compare/amd-mi300a-vs-amd-mi300x.
- vs GB200 NVL72 → GB200 NVL72 is NVIDIA's rack-scale platform for trillion-parameter work, at $3M+ per rack. MI300A in HPE Cray rack form sits in a similar tier on the AMD side. Pick by ecosystem alignment and scale.
- vs Grace Hopper Superchip → NVIDIA's equivalent CPU+GPU integrated platform on the Hopper generation. Different ecosystem, similar architectural concept.
- vs DGX H200 → DGX H200 is 8× discrete H200 SXM5 in 8U at ~$300k. MI300A in HPE Cray APU server form is supercomputer-tier procurement. Wrong comparison — different scales.
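The percentage gaps quoted in the MI300X comparison above follow directly from the spec numbers on this page. A small sketch (the helper name `pct_more` is hypothetical):

```python
def pct_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


print(round(pct_more(192, 128)))  # MI300X vs MI300A memory, GB
print(round(pct_more(304, 228)))  # MI300X vs MI300A compute units
```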
Specs
| Spec | Value |
| --- | --- |
| VRAM | 128 GB |
| Power draw (peak) | 760 W |
| Released | 2023 |
| Backends | ROCm |
Models that fit
Open-weight models small enough to run on AMD Instinct MI300A (APU) with usable context.
Frequently asked
What models can AMD Instinct MI300A (APU) run?
Does AMD Instinct MI300A (APU) support CUDA?
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.