16. Evaluation
Chapter 16 of 24 · 20 min
EXERCISE: Implement Task-Specific Evaluation
Create a custom evaluation function for text classification that computes per-class precision and recall:
import torch
from sklearn.metrics import classification_report

def evaluate_classification(model, dataloader, id2label, device):
    """Prints per-class metrics."""
    model.eval()
    predictions = []
    references = []
    with torch.no_grad():
        for batch in dataloader:
            # Implementation here
            pass
    # Compute confusion matrix and derive metrics
    # Order the class names by id so they line up with the integer labels
    labels = sorted(id2label)
    target_names = [id2label[i] for i in labels]
    print(classification_report(references, predictions,
                                labels=labels, target_names=target_names))
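The stub's last comment mentions computing a confusion matrix and deriving metrics from it. Below is a minimal sketch of that step in plain Python, with no scikit-learn. It assumes the evaluation loop has already filled `references` and `predictions` with integer class ids and that `id2label` maps those ids to names. The helper name `per_class_precision_recall` and the sample data are hypothetical and not part of the exercise.

```python
from collections import defaultdict


def per_class_precision_recall(references, predictions, id2label):
    """Return {label_name: {"precision", "recall", "support"}} from integer class ids."""
    # confusion[true_id][pred_id] counts how often each (true, predicted) pair occurred.
    confusion = defaultdict(lambda: defaultdict(int))
    for true_id, pred_id in zip(references, predictions):
        confusion[true_id][pred_id] += 1

    metrics = {}
    for class_id in sorted(id2label):
        true_pos = confusion[class_id][class_id]
        # Column sum: every example predicted as this class.
        predicted_total = sum(row[class_id] for row in confusion.values())
        # Row sum: every example that truly belongs to this class.
        actual_total = sum(confusion[class_id].values())
        metrics[id2label[class_id]] = {
            # Report 0.0 instead of dividing by zero when a class never appears.
            "precision": true_pos / predicted_total if predicted_total else 0.0,
            "recall": true_pos / actual_total if actual_total else 0.0,
            "support": actual_total,
        }
    return metrics


# Hypothetical toy data: three classes, six examples.
sample_id2label = {0: "negative", 1: "neutral", 2: "positive"}
sample_references = [0, 0, 1, 1, 1, 2]
sample_predictions = [0, 1, 1, 1, 0, 2]
for name, scores in per_class_precision_recall(
    sample_references, sample_predictions, sample_id2label
).items():
    print(name, scores)
```

Precision divides the diagonal entry by its column sum (everything predicted as the class), while recall divides it by its row sum (everything that truly is the class). Writing the two denominators out by hand makes it easier to see which error type each metric penalizes.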