24. Data Visualization with Matplotlib
Data visualization isn't decoration; it's how you debug pipelines and understand model behavior. Matplotlib is the foundation: flexible, mature, and the layer that libraries such as seaborn and the pandas plotting API build on.
Getting started: create the figure with pyplot, then draw through the Axes objects it returns:
import matplotlib.pyplot as plt
import numpy as np
# Generate sample training metrics (synthetic curves plus noise; seeded so the plot is reproducible)
rng = np.random.default_rng(0)
epochs = np.arange(1, 51)
train_loss = 2.5 * np.exp(-0.05 * epochs) + rng.normal(0, 0.05, epochs.size)
val_loss = 2.3 * np.exp(-0.04 * epochs) + rng.normal(0, 0.08, epochs.size)
val_accuracy = 0.3 + 0.65 * (1 - np.exp(-0.08 * epochs)) + rng.normal(0, 0.02, epochs.size)
# Basic plot
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Left: Loss curves
axes[0].plot(epochs, train_loss, label='Train Loss', color='#1f77b4')
axes[0].plot(epochs, val_loss, label='Val Loss', color='#ff7f0e', linestyle='--')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Training Progress')
axes[0].legend()
axes[0].grid(True, alpha=0.3)
# Right: Accuracy
axes[1].plot(epochs, val_accuracy, label='Val Accuracy', color='#2ca02c')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')
axes[1].set_title('Validation Accuracy')
axes[1].legend()
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('training_metrics.png', dpi=150)
plt.show()
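plt.subplots(1, 2) returns a Figure and an array of two Axes laid out in one row and two columns. Each Axes is a drawing surface: call plot, set_xlabel, legend, and so on directly on it. figsize sets the figure size in inches, and dpi (passed here to savefig) sets the pixels per inch, so the saved image is 14 × 150 = 2100 pixels wide and 5 × 150 = 750 pixels tall.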
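To make the relationship between figsize, dpi, and the axes array concrete, here is a minimal sketch. It assumes the non-interactive Agg backend so it runs without a display, and the variable names (save_dpi, width_px, height_px) are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# With one row and two columns, `axes` is a 1-D NumPy array holding two Axes.
axes_shape = axes.shape

# figsize is measured in inches; multiplying by dpi gives the size in pixels
# of an image saved with savefig(..., dpi=save_dpi).
save_dpi = 150
width_in, height_in = fig.get_size_inches()
width_px = int(round(width_in * save_dpi))
height_px = int(round(height_in * save_dpi))

print(axes_shape)           # shape of the axes array
print(width_px, height_px)  # pixel dimensions at the chosen dpi

plt.close(fig)  # release the figure's memory
```

Raising dpi increases resolution without changing the layout, while changing figsize changes how much room the text and lines have relative to the plot area.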
Local verification checkpoint
Run the smallest example from this chapter in a local workspace and record the package version, runtime, data path, and observed output. If the result depends on model size, vector count, CPU/GPU backend, or available memory, note that constraint beside the exercise so the lesson remains reproducible.
Generate 100 random 2D points clustered around two centers. Plot them as a scatter plot with different colors for each cluster. Add a legend. Save as clusters.png.
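One possible approach, for checking your own attempt against, is sketched here. The cluster centers, the spread of 0.8, the seed, and the split of 50 points per cluster are all assumptions chosen for illustration; the exercise does not fix them. The Agg backend is likewise assumed so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)  # seeded so the figure is reproducible

# Hypothetical centers and spread; 50 points per cluster gives 100 in total.
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
points_per_cluster = 50
clusters = [
    rng.normal(loc=center, scale=0.8, size=(points_per_cluster, 2))
    for center in centers
]

fig, ax = plt.subplots(figsize=(6, 5))
for i, pts in enumerate(clusters):
    # Each scatter call without an explicit color takes the next color in the cycle.
    ax.scatter(pts[:, 0], pts[:, 1], label=f"Cluster {i + 1}", alpha=0.7)

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Two clusters")
ax.legend()
fig.tight_layout()
fig.savefig("clusters.png", dpi=150)
plt.close(fig)
```

Calling scatter once per cluster keeps the legend simple: each call contributes one labeled entry.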