06. Knowledge Distillation
Chapter 6 of 18 · 15 min
EXERCISE
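Train a small CNN as a student model, using a larger pre-trained CNN as the teacher. Train the same student architecture twice: once on the hard labels alone, and once with soft-target distillation from the teacher's outputs. Compare the accuracy of the two students on a held-out test set to measure whether distillation improves generalization.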
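As a starting point, here is a minimal sketch of a per-example soft-target distillation loss in plain Python, with no deep learning framework assumed. It mixes the usual cross-entropy on the hard label with the KL divergence between the temperature-softened teacher and student distributions, scaled by T^2 as in the common formulation. The function names, the `alpha` weighting convention, and the default values `temperature=4.0` and `alpha=0.5` are illustrative assumptions, not values prescribed by this chapter.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax, computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_sum = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Hypothetical helper: alpha * hard-label CE + (1 - alpha) * T^2 * KL.

    alpha and temperature are assumed hyperparameters to tune in the exercise.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -log_softmax(student_logits)[label]

    # Soft-target term: KL(teacher || student) on softened distributions.
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # T^2 keeps the soft-term gradient scale comparable across temperatures.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


# Example call for a 3-class problem where the true class is index 2.
loss = distillation_loss([0.5, 1.0, 2.0], [0.1, 1.5, 3.0], label=2)
print(loss)
```

Setting `alpha=1.0` recovers the plain hard-label baseline, which gives the "without distillation" arm of the comparison using the same code path. When the student and teacher logits are identical, the soft-target term is zero. In a real training loop you would compute the same quantity on batched tensors in your framework of choice and average it over the batch.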