A production-grade ML pipeline implementation using Kubeflow Pipelines (KFP) on Google Cloud Vertex AI. This project demonstrates MLOps best practices for automating end-to-end ML workflows.
This repository implements a complete ML pipeline for the Iris dataset classification problem, showcasing:
- Automated data ingestion from BigQuery
- Parallel model training (Decision Tree, Random Forest, XGBoost)
- Automatic model evaluation and selection
- Model registration and versioning in Vertex AI
- Automated deployment to FastAPI services on Cloud Run
- Batch inference capabilities
- Real-time streaming inference with Dataflow
- REST API serving with FastAPI
- Component-based Architecture: Modular, reusable pipeline components (see the component sketch after this list)
- Multi-model Training: Trains multiple models in parallel and selects the best performer
- Cloud-native: Deep integration with Google Cloud services (Vertex AI, BigQuery, GCS)
- Production-ready: Includes model versioning, schema validation, and deployment automation
- Containerized: Each component runs in Docker containers with isolated dependencies
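For illustration, a minimal sketch of what one of these components can look like in KFP v2. The column name and artifact handling here are hypothetical, not taken from the repo:

```python
from kfp import dsl

@dsl.component(
    base_image="python:3.10",
    packages_to_install=["pandas", "scikit-learn"],
)
def train_decision_tree(
    train_set: dsl.Input[dsl.Dataset],
    model: dsl.Output[dsl.Model],
) -> float:
    """Train a decision tree and return its accuracy on the training split."""
    import pickle

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    df = pd.read_csv(train_set.path)  # artifact written by the upstream data component
    X, y = df.drop(columns=["species"]), df["species"]  # "species" is a hypothetical label column
    clf = DecisionTreeClassifier().fit(X, y)

    with open(model.path, "wb") as f:  # persist the trained model as a KFP artifact
        pickle.dump(clf, f)
    return float(clf.score(X, y))
```

Because dependencies are declared per component, each step runs in its own container with an isolated environment.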
```
src/ml_pipelines_kfp/
├── iris_xgboost/ # Main Iris classification implementation
│ ├── pipelines/ # KFP pipeline definitions
│ │ ├── components/ # Reusable pipeline components
│ │ ├── iris_pipeline_training.py
│ │ └── iris_pipeline_inference.py
│ ├── models/ # Pydantic models for API
│ ├── server.py # FastAPI serving application
│ ├── bq_dataloader.py # BigQuery data loading utility
│ └── constants.py # Configuration constants
├── workflows/ # Alternative workflow implementations
├── dataflow/ # Dataflow streaming pipelines
│   └── iris_streaming_pipeline.py
└── notebooks/ # Example notebooks and experiments
schemas/ # Input/output schemas for Vertex AI
Dockerfile # Container definition
pyproject.toml # Project dependencies
pipeline.yaml # Pipeline configuration
deploy_dataflow_streaming.sh # Dataflow streaming deployment script
```
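For context, `server.py` exposes the blessed model over REST. A minimal sketch of such a FastAPI app; the request fields, endpoint, and model path here are hypothetical:

```python
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="iris-classifier")

class IrisFeatures(BaseModel):
    """Request body; field names are illustrative."""
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

# Hypothetical path; the real container bakes in or downloads the blessed model.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict")
def predict(features: IrisFeatures) -> dict:
    row = [[
        features.sepal_length,
        features.sepal_width,
        features.petal_length,
        features.petal_width,
    ]]
    return {"prediction": str(model.predict(row)[0])}
```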
- Python 3.9-3.10
- Google Cloud Project with enabled APIs:
- Vertex AI
- BigQuery
- Cloud Storage
- Service account with appropriate permissions
- uv package manager (for dependency management)
```bash
# Clone the repository
git clone <repository-url>
cd ml_pipelines_kfp

# Install dependencies
uv pip install -e .
```

```bash
# Set up credentials and load Iris dataset
./src/ml_pipelines_kfp/iris_xgboost/load_data.sh
```

```bash
# Execute the training pipeline on Vertex AI
python src/ml_pipelines_kfp/iris_xgboost/pipelines/iris_pipeline_training.py
```

This will:
- Load data from BigQuery
- Train Decision Tree and Random Forest models in parallel
- Evaluate and select the best model
- Register the model in Vertex AI Model Registry with "blessed" alias
- Deploy the blessed model to FastAPI service on Cloud Run
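Under the hood, scripts like this compile the KFP pipeline and submit it to Vertex AI. A minimal sketch of that submission, assuming the compiled `pipeline.yaml` in the repo root; the GCS bucket name is hypothetical:

```python
from google.cloud import aiplatform

aiplatform.init(project="deeplearning-sahil", location="us-central1")

# pipeline.yaml is the compiled KFP pipeline definition.
job = aiplatform.PipelineJob(
    display_name="iris-training",
    template_path="pipeline.yaml",
    pipeline_root="gs://deeplearning-sahil-pipelines/root",  # hypothetical bucket
)
job.submit()  # use job.run() instead to block until the run finishes
```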
```bash
# Execute batch inference
python src/ml_pipelines_kfp/iris_xgboost/pipelines/iris_pipeline_inference.py
```

Deploy a Dataflow streaming job for real-time inference:
```bash
# Deploy streaming pipeline (update SERVICE_URL with the actual Cloud Run URL)
./deploy_dataflow_streaming.sh
```

Start generating test data:
```bash
# Run data producer to send samples to Pub/Sub
python src/ml_pipelines_kfp/iris_xgboost/pubsub_producer.py
```
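A minimal sketch of what such a producer can look like; the topic name is hypothetical:

```python
import json
import random
import time

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("deeplearning-sahil", "iris-inference-requests")  # hypothetical topic

while True:
    # Random but plausible Iris measurements.
    sample = {
        "sepal_length": round(random.uniform(4.0, 8.0), 1),
        "sepal_width": round(random.uniform(2.0, 4.5), 1),
        "petal_length": round(random.uniform(1.0, 7.0), 1),
        "petal_width": round(random.uniform(0.1, 2.5), 1),
    }
    publisher.publish(topic_path, data=json.dumps(sample).encode("utf-8")).result()
    time.sleep(1.0)
```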
```bash
# Format code
black src/

# Lint
ruff check src/

# Type checking
mypy src/
```

The project follows a component-based architecture where each ML pipeline step is a self-contained KFP component (a compilable skeleton follows the list below):
- Data Component: Loads and splits data from BigQuery
- Model Components: Implements various ML algorithms
- Evaluation Component: Compares model performance
- Registry Component: Manages model versioning with "blessed" aliases
- Deployment Component: Deploys blessed models to Cloud Run FastAPI services
- Inference Component: Performs batch predictions
- Streaming Component: Real-time inference via Dataflow and Pub/Sub
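A compilable skeleton of how these components are wired together; the stub components here stand in for the real ones under `pipelines/components/`:

```python
from kfp import compiler, dsl

@dsl.component(base_image="python:3.10")
def train(algo: str) -> float:
    """Stub for the model components; returns a fake accuracy."""
    return 0.95 if algo == "random_forest" else 0.90

@dsl.component(base_image="python:3.10")
def select_best(dt_accuracy: float, rf_accuracy: float) -> str:
    """Stub for the evaluation component."""
    return "random_forest" if rf_accuracy >= dt_accuracy else "decision_tree"

@dsl.pipeline(name="iris-training-skeleton")
def iris_training_skeleton():
    # No data dependency between the two training tasks, so KFP runs them in parallel.
    dt = train(algo="decision_tree")
    rf = train(algo="random_forest")
    select_best(dt_accuracy=dt.output, rf_accuracy=rf.output)

compiler.Compiler().compile(iris_training_skeleton, package_path="pipeline_skeleton.yaml")
```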
Key configuration is managed in `src/ml_pipelines_kfp/iris_xgboost/constants.py` (a plausible shape is sketched below):
- Project ID: `deeplearning-sahil`
- Region: `us-central1`
- Dataset: `ml_datasets.iris`
- Model naming and versioning
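A sketch of that module, using the values above; the constant names themselves are hypothetical:

```python
# src/ml_pipelines_kfp/iris_xgboost/constants.py (illustrative)
PROJECT_ID = "deeplearning-sahil"
REGION = "us-central1"
BQ_DATASET = "ml_datasets"
BQ_TABLE = "iris"

MODEL_DISPLAY_NAME = "iris-classifier"  # hypothetical
BLESSED_ALIAS = "blessed"
PIPELINE_ROOT = f"gs://{PROJECT_ID}-pipelines/root"  # hypothetical bucket
```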
The repository includes a GitHub Actions workflow (`.github/workflows/cicd.yaml`) that triggers on pushes to the `main` branch and:
- Builds Docker images for KFP components
- Builds generic FastAPI inference containers
- Pushes the images to Google Artifact Registry
- Orchestration: Kubeflow Pipelines 2.8.0
- Cloud Platform: Google Cloud (Vertex AI, BigQuery, GCS, Cloud Run, Dataflow)
- ML Frameworks: scikit-learn, XGBoost
- API Framework: FastAPI
- Streaming: Apache Beam, Dataflow, Pub/Sub
- Data Processing: Pandas, Polars, Dask
- Package Management: uv, Hatchling
The project uses a blessed model pattern for production deployments:
- Training Pipeline: Trains multiple models and selects the best performer
- Model Registry: Stores the winning model in Vertex AI with the "blessed" alias (see the registration sketch after this list)
- Deployment Pipeline: Automatically deploys only "blessed" models to production
- Cost Optimization: Serves blessed models from FastAPI on Cloud Run instead of a managed Vertex AI endpoint
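A minimal sketch of registering a winning model with the alias via the Vertex AI SDK; the display name and URIs are hypothetical:

```python
from google.cloud import aiplatform

aiplatform.init(project="deeplearning-sahil", location="us-central1")

model = aiplatform.Model.upload(
    display_name="iris-classifier",                      # hypothetical
    artifact_uri="gs://deeplearning-sahil-models/iris",  # hypothetical GCS path
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
    ),
    version_aliases=["blessed"],  # downstream deployment keys off this alias
)
print(model.resource_name, model.version_id)
```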
Real-time inference is handled through the following flow (a minimal Beam sketch follows the list):
- Data Ingestion: Pub/Sub receives real-time inference requests
- Stream Processing: Dataflow processes messages and calls FastAPI services
- Model Serving: Cloud Run hosts FastAPI containers with blessed models
- Results Storage: Predictions are written to BigQuery for monitoring
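A minimal sketch of such a streaming pipeline in Apache Beam; the subscription, BigQuery table, Cloud Run URL, and the `/predict` response shape are all assumptions:

```python
import json

import apache_beam as beam
import requests
from apache_beam.options.pipeline_options import PipelineOptions

SERVICE_URL = "https://iris-serving-xxxx-uc.a.run.app"  # hypothetical Cloud Run URL

class Predict(beam.DoFn):
    def process(self, message: bytes):
        features = json.loads(message.decode("utf-8"))
        # Assumes the FastAPI service returns {"prediction": ...}.
        resp = requests.post(f"{SERVICE_URL}/predict", json=features, timeout=10)
        resp.raise_for_status()
        yield {**features, "prediction": resp.json()["prediction"]}

options = PipelineOptions(streaming=True)  # add --runner=DataflowRunner etc. on deploy
with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(
            subscription="projects/deeplearning-sahil/subscriptions/iris-requests"  # hypothetical
        )
        | "Predict" >> beam.ParDo(Predict())
        | "Write" >> beam.io.WriteToBigQuery(
            "deeplearning-sahil:ml_datasets.iris_predictions",  # hypothetical table
            schema=(
                "sepal_length:FLOAT,sepal_width:FLOAT,"
                "petal_length:FLOAT,petal_width:FLOAT,prediction:STRING"
            ),
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```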
- Cost Effective: Cloud Run FastAPI services cost ~90% less than Vertex AI endpoints
- Scalable: Dataflow auto-scales based on Pub/Sub message volume
- Reliable: Only production-ready "blessed" models are deployed
- Observable: All predictions logged to BigQuery with metadata