15. Compression Benchmarking

Chapter 15 of 18 · 25 min
EXERCISE

Build a benchmarking script that measures inference latency, memory usage, and accuracy for a base model and three compressed variants. Run 100 warmup iterations before collecting measurements. Report mean and P99 latency.
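
One possible starting point is sketched below, using only the Python standard library. It is a minimal sketch under stated assumptions, not a reference solution: each model is assumed to be a plain callable that maps one input to one predicted label, the names `benchmark` and `percentile` and the default of 1000 timed runs are hypothetical choices, and the toy threshold functions merely stand in for a base model and three compressed variants. `tracemalloc` only tracks memory allocated through Python, so a real framework model would likely need that framework's own memory counters instead.

```python
import math
import statistics
import time
import tracemalloc


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def benchmark(predict, inputs, labels, warmup=100, runs=1000):
    """Measure latency, peak Python-level memory, and accuracy for one model.

    `predict` is an assumed interface: a callable taking one input and
    returning one label. `runs` is an illustrative default.
    """
    # Warmup iterations are discarded so caches and lazy initialization
    # do not skew the timed measurements.
    for i in range(warmup):
        predict(inputs[i % len(inputs)])

    # Timed iterations: one latency sample (in milliseconds) per call.
    latencies_ms = []
    for i in range(runs):
        x = inputs[i % len(inputs)]
        start = time.perf_counter()
        predict(x)
        latencies_ms.append((time.perf_counter() - start) * 1e3)

    # Memory and accuracy share a separate pass, because tracemalloc
    # slows execution and would distort the latency numbers.
    tracemalloc.start()
    correct = sum(predict(x) == y for x, y in zip(inputs, labels))
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "p99_ms": percentile(latencies_ms, 99),
        "peak_kib": peak_bytes / 1024,
        "accuracy": correct / len(labels),
    }


if __name__ == "__main__":
    # Toy data and models (hypothetical stand-ins for real checkpoints).
    inputs = [i / 50 for i in range(100)]
    labels = [int(x > 1.0) for x in inputs]
    models = {
        "base": lambda x: int(x > 1.0),
        "variant_a": lambda x: int(round(x, 2) > 1.0),
        "variant_b": lambda x: int(round(x, 1) > 1.0),
        "variant_c": lambda x: int(round(x) > 1.0),
    }
    for name, predict in models.items():
        r = benchmark(predict, inputs, labels)
        print(f"{name:10s} mean={r['mean_ms']:.4f} ms  p99={r['p99_ms']:.4f} ms  "
              f"peak={r['peak_kib']:.1f} KiB  acc={r['accuracy']:.2%}")
```

Timing each call individually, rather than timing the whole loop and dividing, is what makes a tail statistic such as P99 available; a single aggregate time would only yield the mean.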