RUNLOCALAI · v38

Independently operated catalog for local-AI hardware and software. Hand-written verdicts. Source-cited claims. Reproducible commands when we have them.

OP·Fredoline Eruo
© 2026 runlocalai.co · Independently operated
MLOps for Local AI

13. Drift Detection

Chapter 13 of 24 · 15 min
KEY INSIGHT

Drift detection is the practice of identifying when your model's operating environment diverges from its training conditions. In local AI deployments, where models serve specific user populations over extended periods, drift is not theoretical; it is inevitable.

### Understanding Drift in Context

Drift occurs when the statistical properties of your input data, output predictions, or the underlying problem itself change over time. Unlike cloud deployments where retraining pipelines can trigger automatically, local AI operators must build explicit observation into the serving stack.

Drift compounds silently. A model that performs adequately in month one may degrade badly by month six without any external indication. Users adapt their queries, your data distribution shifts, or the real-world phenomenon you're predicting fundamentally changes.

### Detection Approaches

There are three primary drift detection approaches:

**Statistical tests** compare feature distributions between a reference period and current observations. The Kolmogorov-Smirnov test measures the maximum distance between the empirical cumulative distribution functions of two samples of a numeric feature. The chi-squared test evaluates shifts in categorical features by comparing category counts. Both are lightweight to compute and suitable for deployment on edge devices.

**Distance-based methods** calculate divergence between probability distributions. KL divergence, Jensen-Shannon distance, and Wasserstein distance each offer different sensitivity profiles. They are cheap to compute on binned or windowed data, but they return a raw magnitude rather than a p-value, so you must choose thresholds empirically.

**Sequential methods** track performance metrics over time, treating drift detection as a change-point problem. The Page-Hinkley test and CUSUM (cumulative sum) flag statistically significant shifts in a monitored statistic.

### Implementation Considerations

Local deployment constraints shape your drift detection architecture. You cannot stream unbounded volumes of data to a central server for batch analysis.
Instead, implement rolling window statistics computed on-device with lightweight reporting to a central dashboard.

```python
# Python: Rolling window drift detection using Wasserstein distance
from collections import deque

import numpy as np
from scipy.stats import wasserstein_distance


class RollingDriftDetector:
    def __init__(self, window_size: int = 1000, threshold: float = 0.15):
        self.reference_window = None  # 2-D array: (n_samples, n_features + 1)
        # deque(maxlen=...) evicts the oldest sample in O(1) once the window is full
        self.current_window = deque(maxlen=window_size)
        self.window_size = window_size
        # Wasserstein distance is in the feature's own units, so standardize
        # features or tune the threshold to their scale
        self.threshold = threshold
        self.drift_detected = False

    def add_sample(self, features: np.ndarray, prediction: float):
        """Add a sample from current serving traffic."""
        # Flatten the features and append the prediction as one extra column
        sample = np.concatenate([np.ravel(features), [prediction]])
        self.current_window.append(sample)

    def set_reference(self, reference_data):
        """Set reference distribution from training or last validation.

        Rows must use the same layout as add_sample: features, then prediction.
        """
        self.reference_window = np.asarray(reference_data, dtype=float)

    def check_drift(self) -> tuple[bool, float]:
        """Check if drift exceeds threshold. Returns (drifted, distance)."""
        if self.reference_window is None or len(self.reference_window) == 0:
            raise ValueError("Call set_reference() before check_drift()")
        if len(self.current_window) < 100:
            return False, 0.0  # Insufficient data
        current = np.asarray(self.current_window, dtype=float)
        # wasserstein_distance compares two 1-D samples, so compute it per
        # column and report the worst-drifting column
        distance = max(
            wasserstein_distance(current[:, j], self.reference_window[:, j])
            for j in range(current.shape[1])
        )
        self.drift_detected = distance > self.threshold
        return self.drift_detected, float(distance)
```

### Practical Limitations

Drift detection without ground truth labels is inherently limited. You detect distribution changes, not performance degradation. A drifted model might still perform acceptably, and a stable-looking input distribution can mask a catastrophic collapse in performance. Pair statistical drift detection with user feedback mechanisms where possible.
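As a rough illustration of the statistical tests described under Detection Approaches, the sketch below runs a two-sample Kolmogorov-Smirnov test on a numeric feature and a chi-squared test on a categorical one, using SciPy. The helper names `ks_drift` and `chi2_drift`, the `alpha=0.01` cutoff, and the synthetic data are assumptions made for this example rather than part of the course material.

```python
from collections import Counter

import numpy as np
from scipy.stats import chi2_contingency, ks_2samp


def ks_drift(reference, current, alpha=0.01):
    """Two-sample KS test on one numeric feature.

    Returns (drifted, statistic, p_value). alpha is an illustrative cutoff.
    """
    result = ks_2samp(reference, current)
    return bool(result.pvalue < alpha), float(result.statistic), float(result.pvalue)


def chi2_drift(reference, current, alpha=0.01):
    """Chi-squared test on one categorical feature.

    Builds a 2 x k contingency table of category counts (reference row,
    current row) and returns (drifted, p_value).
    """
    ref_counts, cur_counts = Counter(reference), Counter(current)
    categories = sorted(set(ref_counts) | set(cur_counts))
    table = np.array([
        [ref_counts[c] for c in categories],
        [cur_counts[c] for c in categories],
    ])
    _, p_value, _, _ = chi2_contingency(table)
    return bool(p_value < alpha), float(p_value)


# Synthetic data, for illustration only
rng = np.random.default_rng(0)
ref_num = rng.normal(0.0, 1.0, 2000)
shifted_num = rng.normal(1.0, 1.0, 2000)   # mean shifted by one standard deviation
ref_cat = ["a"] * 700 + ["b"] * 300
flipped_cat = ["a"] * 300 + ["b"] * 700    # category proportions reversed

print("KS, shifted feature:", ks_drift(ref_num, shifted_num))
print("Chi-squared, flipped categories:", chi2_drift(ref_cat, flipped_cat))
```

When you run one test per feature, the chance of at least one false alarm grows with the number of features, so it is worth tightening `alpha` or requiring several consecutive detections before alerting.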

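For the distance-based methods described under Detection Approaches, one possible sketch bins a numeric feature into a shared histogram and computes the Jensen-Shannon distance with SciPy. The function name `js_distance`, the choice of 20 bins, and the synthetic samples are assumptions for illustration; any alert threshold still has to be calibrated on your own data.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon


def js_distance(reference, current, bins=20):
    """Jensen-Shannon distance between two samples of one numeric feature."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Shared bin edges so both histograms describe the same intervals
    edges = np.histogram_bin_edges(np.concatenate([reference, current]), bins=bins)
    p, _ = np.histogram(reference, bins=edges)
    q, _ = np.histogram(current, bins=edges)
    # jensenshannon normalizes the counts; base=2 bounds the result to [0, 1]
    return float(jensenshannon(p, q, base=2))


# Synthetic data, for illustration only
rng = np.random.default_rng(1)
baseline = rng.normal(0.0, 1.0, 5000)
similar = rng.normal(0.0, 1.0, 5000)
shifted = rng.normal(2.0, 1.0, 5000)

print("similar:", js_distance(baseline, similar))
print("shifted:", js_distance(baseline, shifted))
```

Unlike KL divergence, the Jensen-Shannon distance is symmetric and stays finite when a bin is empty in one of the two histograms, which makes it forgiving on small windows.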

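The sequential methods mentioned under Detection Approaches can be sketched with a small Page-Hinkley detector that watches a single monitored statistic for an upward shift. This is a minimal version of the test under stated assumptions: the class layout, the `delta` and `threshold` defaults, the `first_alarm` helper, and the synthetic stream are all illustrative choices, not values from the course.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in a monitored statistic."""

    def __init__(self, delta: float = 0.005, threshold: float = 5.0):
        self.delta = delta          # tolerated magnitude of change (illustrative)
        self.threshold = threshold  # alarm level, often written as lambda
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0       # running sum of deviations from the mean
        self.minimum = 0.0          # smallest cumulative sum seen so far

    def update(self, value: float) -> bool:
        """Consume one observation; return True if a shift is flagged."""
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return (self.cumulative - self.minimum) > self.threshold


def first_alarm(stream, delta: float = 0.005, threshold: float = 5.0):
    """Return the index of the first flagged observation, or None."""
    detector = PageHinkley(delta=delta, threshold=threshold)
    for index, value in enumerate(stream):
        if detector.update(value):
            return index
    return None


# Synthetic stream: the monitored statistic jumps from 0.1 to 0.9 at index 200
stream = [0.1] * 200 + [0.9] * 200
print("first alarm at index:", first_alarm(stream))
```

The detector keeps only four numbers of state, so it fits comfortably on-device. To catch downward shifts as well, one option is to run a second instance on the negated statistic.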
EXERCISE

Implement a rolling drift detector using the KL divergence method. Store a reference distribution from your initial training data, then implement a scheduled check that logs drift measurements and warns when divergence exceeds your defined threshold. Validate by injecting synthetic drift (scaling features by a factor) and confirming detection.
