KEY INSIGHT
The Pareto frontier reveals the optimal trade-off curve between model accuracy and size, enabling informed decisions about which compression configurations to pursue.
Understanding the relationship between accuracy and model size is essential for choosing compression strategies. The Pareto frontier identifies configurations where no improvement in one metric is possible without sacrificing the other.
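The definition above amounts to a pairwise dominance test. The following is a minimal sketch, assuming each configuration is a dict with `accuracy` (higher is better) and `size_mb` (lower is better); the `dominates` helper and the three sample measurements are hypothetical and only for illustration.

```python
def dominates(a, b):
    """Return True if configuration a Pareto-dominates configuration b.

    a dominates b when it is no worse on both metrics and strictly
    better on at least one of them.
    """
    no_worse = a['accuracy'] >= b['accuracy'] and a['size_mb'] <= b['size_mb']
    strictly_better = a['accuracy'] > b['accuracy'] or a['size_mb'] < b['size_mb']
    return no_worse and strictly_better


# Hypothetical measurements, not real benchmark results
int8 = {'accuracy': 91.0, 'size_mb': 25.0}
int4 = {'accuracy': 88.5, 'size_mb': 12.5}
pruned = {'accuracy': 88.0, 'size_mb': 20.0}

print(dominates(int8, int4))    # False: int8 is more accurate but also larger
print(dominates(int4, pruned))  # True: int4 is both more accurate and smaller
```

A configuration lies on the Pareto frontier exactly when no other configuration dominates it.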
### Computing the Frontier
Generate multiple compression configurations spanning a wide range of target sizes, measure the accuracy and size of each, then keep only the configurations that no other configuration beats on both metrics:
```python
def compute_pareto_frontier(model, compression_configs, test_loader):
    # apply_compression, evaluate, and count_parameters are project
    # helpers assumed to be defined elsewhere.
    results = []
    for config in compression_configs:
        compressed = apply_compression(model, config)
        accuracy = evaluate(compressed, test_loader)
        # parameters * bits / 8 gives bytes; divide by 1e6 for megabytes
        model_size = count_parameters(compressed) * config['bits'] / 8 / 1e6
        results.append({
            'accuracy': accuracy,
            'size_mb': model_size,
            'config': config
        })

    # Sort by accuracy descending, breaking ties by smaller size first
    results.sort(key=lambda x: (-x['accuracy'], x['size_mb']))

    # Identify Pareto-optimal points. After sorting, every earlier point is
    # at least as accurate, so a point is non-dominated only if it is
    # strictly smaller than all points seen so far.
    pareto_frontier = []
    min_size_seen = float('inf')
    for r in results:
        if r['size_mb'] < min_size_seen:
            pareto_frontier.append(r)
            min_size_seen = r['size_mb']
    return pareto_frontier
```
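A toy run on hand-written numbers can make the selection rule concrete. The sketch below skips compression and evaluation and applies a sort-and-sweep rule (sort by accuracy descending, keep a point only if it is smaller than every more accurate point) to already-measured results. The `pareto_from_results` name, the configuration labels, and all numbers are assumptions made up for illustration.

```python
def pareto_from_results(results):
    """Sweep-based frontier extraction over already-measured results."""
    # Highest accuracy first; among equal accuracies, smallest model first
    ordered = sorted(results, key=lambda r: (-r['accuracy'], r['size_mb']))
    frontier = []
    min_size = float('inf')
    for r in ordered:
        # Every earlier point is at least as accurate, so r survives only
        # if it is strictly smaller than all of them
        if r['size_mb'] < min_size:
            frontier.append(r)
            min_size = r['size_mb']
    return frontier


# Hypothetical results, not real measurements
toy_results = [
    {'accuracy': 92.0, 'size_mb': 100.0, 'config': 'fp32'},
    {'accuracy': 91.5, 'size_mb': 50.0, 'config': 'fp16'},
    {'accuracy': 90.0, 'size_mb': 60.0, 'config': 'pruned-40'},
    {'accuracy': 89.0, 'size_mb': 25.0, 'config': 'int8'},
    {'accuracy': 80.0, 'size_mb': 12.5, 'config': 'int4'},
    {'accuracy': 78.0, 'size_mb': 30.0, 'config': 'pruned-70'},
]

frontier = pareto_from_results(toy_results)
print([r['config'] for r in frontier])
# ['fp32', 'fp16', 'int8', 'int4']
```

Here `pruned-40` is dropped because `fp16` is both smaller and more accurate, and `pruned-70` is dropped because `int8` beats it on both metrics.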
### Visualization
```python
import matplotlib.pyplot as plt

def plot_pareto_frontier(results, pareto_points):
    # results: every evaluated configuration; pareto_points: the frontier subset
    plt.figure(figsize=(10, 6))

    # Plot all points
    sizes = [r['size_mb'] for r in results]
    accuracies = [r['accuracy'] for r in results]
    plt.scatter(sizes, accuracies, alpha=0.5, label='All configurations')

    # Highlight Pareto frontier, ordered by size so the line runs left to right
    frontier = sorted(pareto_points, key=lambda p: p['size_mb'])
    frontier_sizes = [p['size_mb'] for p in frontier]
    frontier_accs = [p['accuracy'] for p in frontier]
    plt.plot(frontier_sizes, frontier_accs, 'r-', linewidth=2, label='Pareto frontier')
    plt.scatter(frontier_sizes, frontier_accs, c='red', s=100, zorder=5)

    plt.xlabel('Model Size (MB)')
    plt.ylabel('Accuracy (%)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.savefig('pareto_frontier.png')
```
### Interpreting the Frontier
The frontier reveals several key insights:
1. **Diminishing returns**: Moving along the frontier from large to small models, accuracy drops slowly at first, then steeply once you pass the frontier's knee
2. **Compression headroom**: Points far below the frontier indicate inefficient compression; these configurations underperform what is achievable at the same size
3. **Optimal operating points**: The knee of the frontier (beyond which small size reductions come at large accuracy costs) often represents the best deployment choice
```python
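import numpy as np

def find_knee(frontier_points):
    """Find the knee point, approximated as the largest second derivative."""
    if len(frontier_points) < 3:
        raise ValueError("Need at least 3 frontier points to estimate a knee")
    # Order by size so derivatives are taken along the size axis
    points = sorted(frontier_points, key=lambda p: p['size_mb'])
    sizes = np.array([p['size_mb'] for p in points], dtype=float)
    accuracies = np.array([p['accuracy'] for p in points], dtype=float)

    # Normalize to [0, 1] range (assumes sizes and accuracies are not all equal)
    sizes_norm = (sizes - sizes.min()) / (sizes.max() - sizes.min())
    accuracies_norm = (accuracies - accuracies.min()) / (accuracies.max() - accuracies.min())

    # Second derivative of accuracy with respect to size, used as a
    # curvature proxy; passing sizes_norm accounts for uneven spacing
    slopes = np.gradient(accuracies_norm, sizes_norm)
    curvatures = np.gradient(slopes, sizes_norm)
    knee_idx = np.argmax(np.abs(curvatures))
    return points[knee_idx]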
```
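The compression headroom described in the list above can also be put into numbers. One possible measure, sketched here as an assumption rather than a standard metric, is the accuracy gap between a configuration and the most accurate frontier point that is no larger than it. The `accuracy_headroom` helper and the sample values are hypothetical.

```python
def accuracy_headroom(point, frontier_points):
    """Accuracy gap to the best frontier point that is no larger than `point`."""
    candidates = [
        f['accuracy'] for f in frontier_points
        if f['size_mb'] <= point['size_mb']
    ]
    if not candidates:
        # Nothing on the frontier fits within this size, so no gap is defined
        return 0.0
    return max(0.0, max(candidates) - point['accuracy'])


# Hypothetical frontier and off-frontier points, for illustration only
frontier = [
    {'accuracy': 92.0, 'size_mb': 100.0},
    {'accuracy': 91.5, 'size_mb': 50.0},
    {'accuracy': 89.0, 'size_mb': 25.0},
    {'accuracy': 80.0, 'size_mb': 12.5},
]
mild_pruning = {'accuracy': 90.0, 'size_mb': 60.0}
heavy_pruning = {'accuracy': 78.0, 'size_mb': 30.0}

print(accuracy_headroom(mild_pruning, frontier))   # 1.5
print(accuracy_headroom(heavy_pruning, frontier))  # 11.0
print(accuracy_headroom(frontier[2], frontier))    # 0.0
```

A headroom of zero means the configuration is already on the frontier; a large value means another method reaches much higher accuracy within the same size.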
### Multi-Objective Frontier
When optimizing beyond size and accuracy (e.g., latency, power consumption), use multi-objective optimization to generate the full Pareto set:
```python
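# Import paths as in recent pymoo releases (0.6.x)
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize
from pymoo.problems import get_problem

def multi_objective_frontier():
    # DTLZ1 is a synthetic benchmark standing in for a real compression
    # problem; swap in a problem that evaluates your own configurations
    problem = get_problem("dtlz1", n_var=10, n_obj=3)  # 3 objectives
    algorithm = NSGA2(
        pop_size=100,
        eliminate_duplicates=False
    )
    result = minimize(
        problem,
        algorithm,
        ('n_gen', 200),
        seed=1,
        verbose=False
    )
    return result.F  # Pareto front approximation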
```
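When the configurations have already been measured, no optimizer is needed: the Pareto set can be filtered directly from the measurements. The sketch below is a brute-force filter for any number of objectives, assuming every objective is expressed so that lower is better (accuracy is converted to an error rate). The `non_dominated` helper and the sample tuples are hypothetical.

```python
def non_dominated(points):
    """Return the points that no other point dominates (all objectives minimized)."""
    def dominates(a, b):
        # a is no worse on every objective and strictly better on at least one
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical measurements: (size_mb, latency_ms, error_pct)
measurements = [
    (100.0, 40.0, 8.0),
    (50.0, 22.0, 8.5),
    (25.0, 15.0, 11.0),
    (30.0, 18.0, 12.0),   # worse than (25.0, 15.0, 11.0) on all three
    (12.5, 20.0, 20.0),
]

pareto_set = non_dominated(measurements)
print(len(pareto_set))  # 4
```

The filter compares every pair of points, so its cost grows quadratically with the number of configurations. That is usually acceptable for the few hundred points in a typical compression sweep.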
### Practical Usage
Before committing to a compression configuration:
1. Generate the Pareto frontier across your design space
2. Identify the knee point as the default choice
3. Adjust toward smaller or larger models based on deployment constraints
4. Re-evaluate the chosen configurations on held-out validation data to confirm they remain on the frontier
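Step 3 can be sketched as a simple filter over the frontier. The example below assumes frontier points are dicts with `accuracy` and `size_mb`; the `select_config` helper, its parameters, and the sample values are hypothetical and would need adapting to real deployment constraints.

```python
def select_config(frontier_points, max_size_mb=None, min_accuracy=None):
    """Pick the most accurate frontier point that satisfies the given limits."""
    feasible = [
        p for p in frontier_points
        if (max_size_mb is None or p['size_mb'] <= max_size_mb)
        and (min_accuracy is None or p['accuracy'] >= min_accuracy)
    ]
    if not feasible:
        return None  # No frontier point meets the constraints
    # Prefer higher accuracy; among equal accuracies prefer the smaller model
    return max(feasible, key=lambda p: (p['accuracy'], -p['size_mb']))


# Hypothetical frontier, for illustration only
frontier = [
    {'accuracy': 92.0, 'size_mb': 100.0, 'config': 'fp32'},
    {'accuracy': 91.5, 'size_mb': 50.0, 'config': 'fp16'},
    {'accuracy': 89.0, 'size_mb': 25.0, 'config': 'int8'},
    {'accuracy': 80.0, 'size_mb': 12.5, 'config': 'int4'},
]

choice = select_config(frontier, max_size_mb=30.0)
print(choice['config'])  # int8
print(select_config(frontier, max_size_mb=30.0, min_accuracy=90.0))  # None
```

A `None` result signals that the constraints cannot be met by any configuration in the current design space, which is a cue to widen the sweep or relax a requirement.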