📊 Drift Detection & Monitoring

In MLOps, monitoring is not just about uptime; it’s about Predictive Integrity. We must detect when the model starts lying to us.


🟢 Level 1: Data Drift (Feature Drift)

The distribution of input data changes between training and production.

1. Statistical Tests

  • KS Test (Kolmogorov–Smirnov): Compares continuous numerical features via the maximum distance between their empirical CDFs.
  • Chi-Squared Test: Compares observed vs. expected category frequencies for categorical features.
  • PSI (Population Stability Index): Quantifies how much a distribution has shifted; values above roughly 0.2 are commonly treated as significant drift.
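The tests above can be sketched in a few lines. This is a minimal example, assuming `scipy` and `numpy` are available; the `psi` helper and its binning strategy are one common implementation choice, not a standard library function.

```python
import numpy as np
from scipy import stats

def psi(baseline, live, bins=10):
    """Population Stability Index between two samples of a continuous feature."""
    # Bin edges come from the baseline so both samples are bucketed identically
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_counts, _ = np.histogram(baseline, bins=edges)
    l_counts, _ = np.histogram(live, bins=edges)
    eps = 1e-6  # avoids log(0) for empty buckets
    b_pct = b_counts / b_counts.sum() + eps
    l_pct = l_counts / l_counts.sum() + eps
    return float(np.sum((l_pct - b_pct) * np.log(l_pct / b_pct)))

rng = np.random.default_rng(42)
train = rng.normal(0, 1, 5000)    # training-time distribution
prod = rng.normal(0.5, 1, 5000)   # production distribution has shifted

# KS test: small p-value => the two samples differ significantly
ks_stat, p_value = stats.ks_2samp(train, prod)
print(f"KS={ks_stat:.3f}, p={p_value:.3g}, PSI={psi(train, prod):.3f}")
```

With a 0.5-sigma mean shift and 5,000 samples per side, both the KS test and PSI flag the drift clearly.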

2. Metrics for Drift

  • Mean/Median Shift: The center of the distribution has moved.
  • Variance Shift: The data has become more or less “noisy.”
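Both metrics fall out of simple summary statistics. A minimal sketch, assuming `numpy`; the `shift_report` helper and its field names are illustrative, not a standard API.

```python
import numpy as np

def shift_report(baseline, live):
    """Flag mean and variance shifts relative to the baseline's scale."""
    mean_shift = abs(live.mean() - baseline.mean()) / baseline.std()  # in sigmas
    var_ratio = live.var() / baseline.var()  # >1 means "noisier" than training
    return {"mean_shift_sigmas": mean_shift, "variance_ratio": var_ratio}

rng = np.random.default_rng(0)
base = rng.normal(10, 2, 10_000)
live = rng.normal(11, 3, 10_000)  # center moved and spread widened

report = shift_report(base, live)
print(report)
```

Normalizing the mean shift by the baseline's standard deviation makes the threshold feature-independent (e.g., alert above 0.5 sigmas).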

🟡 Level 2: Concept Drift

The relationship between the input (X) and the output (y) changes.

3. The “Luxury” Example

  • Training: Feature Price > $1000 leads to label Luxury.
  • Production (Inflation): Feature Price > $1000 is now considered Mid-Range.
  • Result: The model still predicts Luxury, which is now wrong.
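The example above can be simulated directly: a frozen decision rule scores perfectly against training-era labels but degrades once the ground truth moves. This is an illustrative sketch with a hypothetical threshold model, assuming `numpy`.

```python
import numpy as np

def model_predict(price):
    """Frozen rule learned at training time: above $1000 => 'Luxury'."""
    return price > 1000

rng = np.random.default_rng(1)
prices = rng.uniform(0, 3000, 5000)

train_labels = prices > 1000  # training-era ground truth
prod_labels = prices > 2000   # post-inflation ground truth: luxury starts at $2000

acc_then = (model_predict(prices) == train_labels).mean()
acc_now = (model_predict(prices) == prod_labels).mean()
print(f"accuracy then={acc_then:.2f}, now={acc_now:.2f}")
```

Note that the input distribution (`prices`) never changed, only the X-to-y mapping did, which is why concept drift needs ground-truth labels (or a proxy) to detect, not just feature monitoring.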

🔴 Level 3: Monitoring Stack

To detect drift at scale, you need an Observability Pipeline.

4. Component Stack

  • Data Capture: Logging requests and responses to a DB (e.g., MongoDB or BigQuery).
  • Drift Engine: Evidently AI or Great Expectations.
  • Metrics Store: Prometheus.
  • Visualization: Grafana (dashboards for Data Scientists).
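The hand-off from the Drift Engine to Prometheus is usually a `/metrics` endpoint serving the text exposition format. A minimal sketch of that rendering step, with no client library; the `drift_psi` metric name and label scheme are illustrative assumptions.

```python
def to_prometheus_format(psi_by_feature: dict, prefix: str = "drift") -> str:
    """Render per-feature PSI values in the Prometheus text exposition
    format, ready to be served from a /metrics endpoint for scraping."""
    lines = []
    for feature, value in psi_by_feature.items():
        # One sample per feature, distinguished by a label
        lines.append(f'{prefix}_psi{{feature="{feature}"}} {value}')
    return "\n".join(lines)

payload = to_prometheus_format({"price": 0.31, "age": 0.04})
print(payload)
```

In practice you would use an official client library (e.g., `prometheus_client` with a `Gauge` per metric), but the wire format above is what Grafana dashboards ultimately query.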

5. Automated Response

When drift is detected:

  1. Trigger Slack Alert.
  2. Tag the production model as “Degraded.”
  3. Trigger a Continuous Training (CT) pipeline to update the model.
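The three-step response above can be wired into a single handler. This is a hypothetical sketch: the action strings stand in for real integrations (a Slack webhook, a Model Registry API call, a CT pipeline trigger), and the 0.2 PSI threshold is a common rule of thumb, not a universal constant.

```python
PSI_THRESHOLD = 0.2  # rule of thumb: PSI above ~0.2 signals major shift

def on_drift(feature: str, psi: float, actions: list) -> None:
    """Hypothetical responder: records the actions the pipeline would take."""
    if psi < PSI_THRESHOLD:
        return  # below threshold: no action, keep monitoring
    actions.append(f"slack_alert:{feature}")   # 1. notify the team
    actions.append("tag_model:Degraded")       # 2. mark the model in the registry
    actions.append("trigger_ct_pipeline")      # 3. kick off retraining

log = []
on_drift("price", psi=0.35, actions=log)
print(log)
```

Keeping the responder idempotent (tagging rather than deleting, triggering rather than deploying) lets a human review the retrained model before it replaces the degraded one.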

Every production feature should have a “Baseline Profile” stored in the Model Registry. The monitor should compare the “Live Profile” (last 1 hour of traffic) to the “Baseline Profile” every 10 minutes.
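The baseline-vs-live comparison described above could look like the following. A minimal sketch: the `{feature: {"mean", "std"}}` profile schema and the 3-sigma tolerance are illustrative assumptions, and a real monitor would run this on a 10-minute schedule against profiles pulled from the Model Registry.

```python
def compare_profiles(baseline: dict, live: dict, tolerance_sigmas: float = 3.0) -> dict:
    """Compare the live profile against the stored baseline, feature by feature.
    Returns the features whose mean has moved beyond the tolerance."""
    drifted = {}
    for feature, base in baseline.items():
        # Shift of the live mean, measured in baseline standard deviations
        shift = abs(live[feature]["mean"] - base["mean"]) / base["std"]
        if shift > tolerance_sigmas:
            drifted[feature] = round(shift, 2)
    return drifted

baseline = {"price": {"mean": 1200.0, "std": 150.0}}   # from the Model Registry
live = {"price": {"mean": 1900.0, "std": 160.0}}       # last hour of traffic

print(compare_profiles(baseline, live))
```

Storing only summary profiles (rather than raw traffic) keeps the comparison cheap enough to run every 10 minutes across hundreds of features.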