🤖 CI/CD Patterns for ML

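In traditional software, CI/CD guards code quality. In MLOps, the same pipelines must also guard model quality and data consistency, because a model can regress when the data changes even if the code does not.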


🟢 Level 1: The CI Pipeline (Validation)

Triggered on every git push.

1. Code Linting & Unit Tests

  • Use Ruff for linting.
  • Use Pytest for utility functions (e.g., categorical encoding logic).
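
As a minimal sketch of the kind of test this step runs, the block below defines a small categorical encoder and three Pytest-style tests for it. The function `encode_categories` and its "unknown maps to -1" behavior are assumptions made up for illustration, not part of any particular codebase.

```python
# Hypothetical utility under test: maps category labels to integer codes.
def encode_categories(values, vocabulary):
    """Return the index of each value in `vocabulary`; unknown values map to -1."""
    index = {label: i for i, label in enumerate(vocabulary)}
    return [index.get(value, -1) for value in values]


# Pytest collects functions named test_*; plain assert statements are enough.
def test_known_categories_get_stable_codes():
    assert encode_categories(["red", "blue", "red"], ["red", "blue"]) == [0, 1, 0]


def test_unknown_category_maps_to_sentinel():
    assert encode_categories(["green"], ["red", "blue"]) == [-1]


def test_empty_input_returns_empty_list():
    assert encode_categories([], ["red", "blue"]) == []
```

In CI, the two checks would typically run as `ruff check .` followed by `pytest`, with the job failing if either command exits non-zero.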

2. Data Validation

  • Use Pandera to check the schema of the incoming dataset.
  • Fail the CI if the percentage of null values in a critical column exceeds a threshold.
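
The null-value gate in the second bullet can be sketched without any dependencies, as below. This is only an illustration of the threshold logic: the 5% limit, the `user_id` column, and the helper names are assumptions, and in a real pipeline a Pandera schema would normally sit alongside this check to own column types and ranges.

```python
MAX_NULL_FRACTION = 0.05  # assumed threshold; choose one per critical column


def null_fraction(rows, column):
    """Fraction of rows in which `column` is absent or None."""
    if not rows:
        return 1.0  # treat an empty dataset as fully missing so the gate fails
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def ci_exit_code(rows, column, max_fraction=MAX_NULL_FRACTION):
    """Return 0 if the column passes the null check, 1 otherwise."""
    fraction = null_fraction(rows, column)
    print(f"{column}: {fraction:.1%} null (limit {max_fraction:.1%})")
    return 0 if fraction <= max_fraction else 1


# Hypothetical sample: one of four rows has no user_id.
sample_rows = [{"user_id": 1}, {"user_id": None}, {"user_id": 3}, {"user_id": 4}]
exit_code = ci_exit_code(sample_rows, "user_id")
```

A CI script would pass `exit_code` to `sys.exit` so that a breach of the threshold fails the job.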

3. “Smoke Test” Training

Run the training job end to end on a tiny sample (about 100 rows) so that crashes surface in CI within minutes rather than partway through a 10-hour GPU job.
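
A smoke test of this kind might look like the sketch below. The `train` function is a toy stand-in (one-parameter gradient descent) for whatever the real training entry point is, and the dataset is synthetic. The point is the shape of the check: sample 100 rows, run one short pass, and verify the loss is a finite number.

```python
import math
import random


def train(rows, epochs=1, lr=0.1):
    """Toy stand-in for the real trainer: fits y = w * x, returns (w, loss)."""
    w = 0.0
    loss = float("inf")
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in rows) / len(rows)
        w -= lr * grad
        loss = sum((w * x - y) ** 2 for x, y in rows) / len(rows)
    return w, loss


def smoke_test(dataset, sample_size=100, seed=0):
    """Run one short training pass on a small random sample."""
    rng = random.Random(seed)  # fixed seed keeps the CI run reproducible
    sample = rng.sample(dataset, min(sample_size, len(dataset)))
    w, loss = train(sample, epochs=1)
    if not math.isfinite(loss):
        raise RuntimeError("smoke test failed: loss is NaN or infinite")
    return w, loss


# Synthetic data for illustration only: y = 3x with x in [0, 1).
full_dataset = [(i / 10_000, 3.0 * i / 10_000) for i in range(10_000)]
w, loss = smoke_test(full_dataset)
```

The smoke test makes no claim about model quality. It only shows that data loading, the training loop, and loss computation run without errors.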


🟡 Level 2: The CD Pipeline (Deployment)

Triggered when a model is tagged as Production in the Model Registry.

4. Shadow Deployment

  1. Deploy New Model (v2) to a separate endpoint.
  2. The gateway mirrors every request to both v1 and v2.
  3. Users only see v1's responses; v2's responses are logged and never returned.
  4. Compare the logged v2 outputs against v1. If v2 is stable, promote it to primary.
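
The routing logic in these steps can be sketched as a single gateway function. Everything named here (`handle_request`, `model_v1`, `model_v2`, the `amount` feature) is hypothetical, and the models are plain functions standing in for two remote endpoints.

```python
import logging

logger = logging.getLogger("shadow")


def handle_request(features, primary, shadow, mismatches):
    """Return the primary model's answer; run the shadow model on the side."""
    primary_out = primary(features)
    try:
        shadow_out = shadow(features)
        if shadow_out != primary_out:
            # Record disagreements for offline comparison.
            mismatches.append((features, primary_out, shadow_out))
            logger.info("shadow disagrees: %r vs %r", primary_out, shadow_out)
    except Exception:
        # A failing shadow must never affect the user-facing response.
        logger.exception("shadow model failed")
        mismatches.append((features, primary_out, None))
    return primary_out


# Hypothetical stand-ins for the v1 and v2 endpoints.
def model_v1(features):
    return features["amount"] > 100


def model_v2(features):
    return features["amount"] > 80


mismatches = []
response = handle_request({"amount": 90}, model_v1, model_v2, mismatches)
```

Here the shadow call is synchronous to keep the sketch short. A real gateway would usually send it asynchronously so that v2 adds no latency to the user's request.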

5. Blue/Green Deployment

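  • Blue: The current production model (by the usual convention, blue is the live environment).
  • Green: The new model, deployed alongside it and fully warmed up.
  • Switch the Load Balancer from Blue to Green in a single step, and keep Blue running so that rollback is just a switch back.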
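
The cut-over can be sketched as a router that holds both models and sends all traffic to exactly one of them. `ModelRouter` and the slot names `live` and `idle` are illustrative assumptions, not the API of any load balancer.

```python
import threading


class ModelRouter:
    """Holds two deployed models and routes all traffic to exactly one."""

    def __init__(self, live, idle):
        self._lock = threading.Lock()
        self._live = live  # model currently serving traffic
        self._idle = idle  # model deployed but receiving no traffic

    def predict(self, features):
        with self._lock:
            model = self._live  # read the current target under the lock
        return model(features)

    def switch(self):
        """Swap live and idle in one step; calling it again rolls back."""
        with self._lock:
            self._live, self._idle = self._idle, self._live


# Hypothetical stand-ins for the old and new model endpoints.
def old_model(features):
    return "old"


def new_model(features):
    return "new"


router = ModelRouter(live=old_model, idle=new_model)
before = router.predict({})
router.switch()
after = router.predict({})
```

Because the previous model stays deployed in the idle slot, a second call to `switch()` restores it without a redeploy.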

🔴 Level 3: Continuous Training (CT)

Continuous Training is the process of automatically retraining models when new data arrives or when production performance drops.

6. The CT Architecture

  1. Trigger: A CloudWatch alarm fires (for example, on a performance drop), or an Airflow sensor detects new data in S3.
  2. Train: A training job is launched on SageMaker, Vertex AI, or Kubernetes.
  3. Validate: Compare vNew to vProduction on the same held-out evaluation set.
  4. Promote: If vNew is better, update the Production tag in the Model Registry.
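
The Validate and Promote steps can be sketched as a single gate. The registry below is a plain dictionary used as a stand-in; it is not the API of MLflow, SageMaker, or Vertex AI. The accuracy metric, the `min_gain` margin, and the toy models are assumptions for illustration.

```python
def accuracy(model, eval_set):
    """Share of (features, label) pairs the model predicts correctly."""
    correct = sum(1 for features, label in eval_set if model(features) == label)
    return correct / len(eval_set)


def promote_if_better(registry, candidate_name, eval_set, min_gain=0.0):
    """Move the Production tag to the candidate only if it scores higher."""
    current_name = registry["tags"]["Production"]
    current_score = accuracy(registry["models"][current_name], eval_set)
    candidate_score = accuracy(registry["models"][candidate_name], eval_set)
    if candidate_score > current_score + min_gain:
        registry["tags"]["Production"] = candidate_name
        return True
    return False  # keep the current production model


# Hypothetical models: each classifies a number as positive or not.
def v_production(x):
    return x > 2


def v_new(x):
    return x > 0


registry = {
    "models": {"vProduction": v_production, "vNew": v_new},
    "tags": {"Production": "vProduction"},
}
eval_set = [(1, True), (5, True), (-2, False), (0, False)]
promoted = promote_if_better(registry, "vNew", eval_set)
```

Both models are scored on the same evaluation set, so the comparison is like for like. A positive `min_gain` keeps noise-level improvements from triggering a promotion.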