SageMaker Modelling Pipeline
Train · Evaluate · Register (MLOps quality gates)
Fully automated ML training pipeline on AWS: data preprocessing, XGBoost training, evaluation with a conditional quality gate on MSE, and governed model registration in SageMaker Model Registry. Triggered by GitHub Actions, which authenticates to AWS via OIDC.
Project Summary
MLOps · Pipeline-2 (Train/Evaluate/Register)
AI/ML type
Supervised regression (XGBoost) + automated MLOps
Domain
AWS SageMaker · Model building & registration
DevOps focus
CI/CD for ML (GitHub Actions + OIDC + SageMaker Pipelines)
Key technologies
AWS + CI/CD + containers
Problem & Objective
Why this pipeline?
Problems solved
- Manual training → non‑reproducible experiments, inconsistent evaluation
- No quality gates → bad models could reach production
- Lack of governance & traceability
Primary objectives
- Fully automated train/eval/register pipeline on AWS
- Conditional registration based on MSE threshold
- Integrate CI/CD with GitHub Actions + OIDC (no static secrets)
Solution & Architecture
SageMaker Pipeline · quality gate
Overview
A SageMaker Pipeline with four steps: Process → Train → Evaluate → Conditional Register. It is triggered by a GitHub Actions workflow on a Linux runner, which assumes an IAM role through OIDC. The evaluation step writes its metrics (MSE) to a JSON file; only models that pass the MSE threshold are registered in SageMaker Model Registry, where the approval workflow is enabled. Model artifacts are stored in S3 and container images in ECR.
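The evaluate-then-gate logic can be sketched in plain Python. This is a minimal illustration, not the project's actual evaluation script: the report layout (`regression_metrics.mse.value`), the file name `evaluation.json`, and the threshold value `6.0` are assumptions chosen for the example. In the real pipeline the comparison would be performed by the SageMaker condition step rather than by local code.

```python
import json
import tempfile
from pathlib import Path

MSE_THRESHOLD = 6.0  # assumed value; the real threshold is project-specific


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between labels and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def write_evaluation_report(y_true, y_pred, output_dir):
    """Write the MSE to evaluation.json (assumed layout) and return its path."""
    report = {
        "regression_metrics": {"mse": {"value": mean_squared_error(y_true, y_pred)}}
    }
    path = Path(output_dir) / "evaluation.json"
    path.write_text(json.dumps(report))
    return path


def passes_quality_gate(report_path, threshold=MSE_THRESHOLD):
    """Return True when the reported MSE is at or below the threshold."""
    report = json.loads(Path(report_path).read_text())
    return report["regression_metrics"]["mse"]["value"] <= threshold


# Toy labels and predictions, only to exercise the gate end to end.
with tempfile.TemporaryDirectory() as tmp:
    report_path = write_evaluation_report([3.0, 5.0, 7.0], [2.0, 5.0, 9.0], tmp)
    GATE_RESULT = passes_quality_gate(report_path)
    print("register model" if GATE_RESULT else "skip registration")
```

Because lower MSE is better, the gate passes when the value is at or below the threshold; a model above it is simply never registered.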
Skills & Technologies
ML engineering + AWS
Primary skills
- Amazon SageMaker Pipelines (advanced)
- MLOps: training/evaluation/registry (advanced)
- CI/CD with GitHub Actions + OIDC (advanced)
- Docker, ECR, IAM, CloudWatch
Languages & tools
- Python (SDK, processing/training scripts)
- YAML (GitHub workflows, pipeline config)
- XGBoost / scikit-learn
Pipeline execution & governance
conditional registration + approvals
Execution
- Trigger: push to main or workflow_dispatch
- GitHub runner: ubuntu-latest, assumes IAM role via OIDC
- SageMaker Pipeline execution: processing (prep), training (XGBoost), evaluation (MSE to JSON)
- Conditional step: if the MSE passes the threshold → register model (status: pending manual approval)
Governance & controls
- Quality gate on MSE – auto‑reject underperforming models
- Model Registry approval workflow (manual approval step before production)
- Least-privilege IAM roles + optional KMS encryption for S3 artifacts
- CloudWatch logs for every job (traceability)
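The registration and approval controls above map onto two SageMaker API operations, `CreateModelPackage` and `UpdateModelPackage`. The sketch below only builds and records the request fields so that it runs offline: `RecordingClient` is a hypothetical stand-in for `boto3.client("sagemaker")`, and the group name, image URI, S3 path, and ARN are placeholders. The pipeline itself presumably registers the model through its conditional registration step rather than through direct calls like these.

```python
APPROVAL_STATUSES = ("PendingManualApproval", "Approved", "Rejected")


def build_registration_request(group_name, image_uri, model_data_url,
                               approval_status="PendingManualApproval"):
    """Build keyword arguments for the SageMaker CreateModelPackage operation."""
    if approval_status not in APPROVAL_STATUSES:
        raise ValueError(f"unknown approval status: {approval_status!r}")
    return {
        "ModelPackageGroupName": group_name,
        "ModelApprovalStatus": approval_status,
        "InferenceSpecification": {
            "Containers": [{"Image": image_uri, "ModelDataUrl": model_data_url}],
            "SupportedContentTypes": ["text/csv"],
            "SupportedResponseMIMETypes": ["text/csv"],
        },
    }


def approve_model_package(sagemaker_client, model_package_arn, note):
    """Manual approval step: flip a registered version to Approved."""
    return sagemaker_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
        ApprovalDescription=note,
    )


class RecordingClient:
    """Hypothetical offline stand-in for boto3.client("sagemaker")."""

    def __init__(self):
        self.calls = []

    def create_model_package(self, **kwargs):
        self.calls.append(("create_model_package", kwargs))
        # Placeholder ARN, not a real resource.
        return {"ModelPackageArn": "arn:aws:sagemaker:eu-west-1:123456789012:"
                                   "model-package/example-group/1"}

    def update_model_package(self, **kwargs):
        self.calls.append(("update_model_package", kwargs))
        return {"ModelPackageArn": kwargs["ModelPackageArn"]}


client = RecordingClient()
request = build_registration_request(
    group_name="example-group",
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/example-xgboost:latest",
    model_data_url="s3://example-bucket/training/output/model.tar.gz",
)
REGISTERED_ARN = client.create_model_package(**request)["ModelPackageArn"]
approve_model_package(client, REGISTERED_ARN, "Reviewed evaluation report")
```

Registering with `PendingManualApproval` by default keeps a human decision between a passing metric and production promotion.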
AWS CI/CD · YAML mapping
GitHub Actions → SageMaker Pipeline
| Architecture block | AWS / YAML construct |
|---|---|
| Source repository | GitHub (modeling / pipeline repository) |
| CI trigger | on: push / workflow_dispatch |
| Runner | GitHub Actions Linux runner (ubuntu-latest) |
| Authentication | aws-actions/configure-aws-credentials (OIDC → IAM Role) |
| Runtime setup | setup-python (SageMaker Pipeline SDK + dependencies) |
| Dependency install | pip install requirements |
| Pipeline packaging | Package pipeline code and dependencies |
| Pipeline create/update | run-pipeline creates or updates SageMaker Pipeline |
| Pipeline execution | Start SageMaker Pipeline execution |
| Processing | SageMaker Processing Jobs for preprocessing and evaluation |
| Training | SageMaker Training Jobs train model and store artifacts in S3 |
| Evaluation metrics | metrics.json written to S3 and checked against quality gates |
| Conditional gate | Condition step blocks failed models from registration |
| Model registry | SageMaker Model Registry registration when thresholds pass |
| Approval status | Model version registered as pending manual approval; once approved, it is available to downstream Pipeline-3 deployment |
| Artifact storage | S3 (datasets, model.tar.gz, metrics.json) |
| Container registry | Amazon ECR (custom images) |
| Encryption | KMS encryption for S3 artifacts (optional) |
| Publish status | Publish metrics / pipeline status back to CI |
| Logs | GitHub Actions + SageMaker Pipelines + CloudWatch Logs |
YAML steps: checkout, configure AWS credentials, setup-python, pip install requirements, run-pipeline, start execution, wait for completion, publish metrics/status.
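The "start execution, wait for completion, publish status" steps could look roughly like the following script run by the CI job. It is a sketch under assumptions: the client is passed in so that a real `boto3.client("sagemaker")` can be used in CI, while `FakeSageMakerClient`, the pipeline name, and the ARN below are hypothetical and exist only so the example runs without AWS access. The two operations used, `start_pipeline_execution` and `describe_pipeline_execution`, are the ones boto3 exposes for SageMaker Pipelines.

```python
import time

TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def run_and_wait(sagemaker_client, pipeline_name, poll_seconds=30,
                 max_polls=120, sleep=time.sleep):
    """Start a pipeline execution and poll until it reaches a terminal status."""
    arn = sagemaker_client.start_pipeline_execution(
        PipelineName=pipeline_name
    )["PipelineExecutionArn"]
    for _ in range(max_polls):
        status = sagemaker_client.describe_pipeline_execution(
            PipelineExecutionArn=arn
        )["PipelineExecutionStatus"]
        if status in TERMINAL_STATUSES:
            return arn, status
        sleep(poll_seconds)
    raise TimeoutError(f"{pipeline_name} did not finish after {max_polls} polls")


class FakeSageMakerClient:
    """Hypothetical offline stand-in that replays a fixed status sequence."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def start_pipeline_execution(self, PipelineName):
        # Placeholder ARN, not a real resource.
        return {"PipelineExecutionArn":
                f"arn:aws:sagemaker:eu-west-1:123456789012:pipeline/"
                f"{PipelineName}/execution/example"}

    def describe_pipeline_execution(self, PipelineExecutionArn):
        return {"PipelineExecutionStatus": next(self._statuses)}


client = FakeSageMakerClient(["Executing", "Executing", "Succeeded"])
EXECUTION_ARN, FINAL_STATUS = run_and_wait(
    client, "example-modelling-pipeline", sleep=lambda seconds: None
)
# A CI step would exit non-zero here so the workflow run reflects the outcome.
EXIT_CODE = 0 if FINAL_STATUS == "Succeeded" else 1
print(FINAL_STATUS, EXIT_CODE)
```

Bounding the number of polls keeps a stuck execution from holding the runner indefinitely.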
AWS DevOps CI/CD – Reference Architecture
Pipeline-2: Modelling / Train–Evaluate–Register
Architecture Flow:
- Developer pushes changes to GitHub (modeling / pipeline repository).
- GitHub Actions workflow is triggered (push / manual dispatch).
- GitHub Actions assumes AWS IAM Role via OIDC (no static credentials).
- CI job packages pipeline code and dependencies.
- SageMaker Pipeline is created/updated (Process → Train → Evaluate → Condition).
- SageMaker Processing Jobs run data preprocessing and evaluation.
- SageMaker Training Jobs train models and store artifacts in S3.
- Evaluation metrics are written to JSON and checked against quality gates.
- If metrics pass thresholds, model is registered in SageMaker Model Registry.
- Model version is registered as pending manual approval; once approved, it is available to downstream deployment (Pipeline-3).
- Logs and status are available in GitHub Actions + SageMaker Pipelines + CloudWatch.
YAML Mapping (GitHub Actions → SageMaker Pipelines):
- Trigger: on push / workflow_dispatch
- Auth: aws-actions/configure-aws-credentials (OIDC → IAM Role)
- Runtime: setup-python (pipeline SDK + deps)
- Steps:
  - pip install requirements
  - run-pipeline (create/update SageMaker Pipeline)
  - start pipeline execution
  - wait for completion (optional)
  - publish metrics / status
Security & Guardrails:
- OIDC-based IAM roles for CI (no secrets in CI)
- Least-privilege IAM roles for SageMaker jobs (processing/training)
- KMS encryption for S3 artifacts (optional)
- Quality gates on evaluation metrics (block bad models)
- Model Registry approvals before production promotion
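The first guardrail, OIDC-based role assumption with no stored secrets, rests on the IAM role's trust policy. The sketch below builds such a policy as a Python dictionary. The account ID, repository name, and branch are placeholders, and restricting the `sub` claim to a single branch is one possible choice rather than the project's confirmed configuration.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def build_github_oidc_trust_policy(account_id, repository, branch="main"):
    """Trust policy letting one GitHub repo/branch assume the role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/"
                                 f"{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience used by aws-actions/configure-aws-credentials.
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # Only tokens issued for this repo and branch are accepted.
                        f"{GITHUB_OIDC_HOST}:sub":
                            f"repo:{repository}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


# Placeholder account and repository, for illustration only.
TRUST_POLICY = build_github_oidc_trust_policy("123456789012", "example-org/example-repo")
print(json.dumps(TRUST_POLICY, indent=2))
```

Pinning the `sub` claim to a specific repository and branch keeps other repositories, forks, and branches from assuming the role.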
Assets & references
Code, diagrams, study material
MLOps CDK GitHub Action
Infrastructure-as-code and GitHub Actions workflow for platform provisioning.
View GitHub repo
Psitron ML Build
Build repository for the ML pipeline stage and related implementation assets.
View GitHub repo
Psitron ML Deploy
Deployment repository for ML release workflows and production delivery assets.
View GitHub repo