
End-to-end ML pipeline with training, deployment, and real-time inference.

A production-grade ML pipeline implementation using Kubeflow Pipelines (KFP) on Google Cloud Vertex AI. This project demonstrates MLOps best practices for automating end-to-end ML workflows.

Overview

This repository implements a complete ML pipeline for the Iris dataset classification problem, showcasing:

  • Automated data ingestion from BigQuery
  • Parallel model training (Decision Tree, Random Forest, XGBoost)
  • Automatic model evaluation and selection
  • Model registration and versioning in Vertex AI
  • Automated deployment to FastAPI services on Cloud Run
  • Batch inference capabilities
  • Real-time streaming inference with Dataflow
  • REST API serving with FastAPI

Key Features

  • Component-based Architecture: Modular, reusable pipeline components
  • Multi-model Training: Trains multiple models in parallel and selects the best performer
  • Cloud-native: Deep integration with Google Cloud services (Vertex AI, BigQuery, GCS)
  • Production-ready: Includes model versioning, schema validation, and deployment automation
  • Containerized: Each component runs in Docker containers with isolated dependencies

Project Structure

src/ml_pipelines_kfp/
├── iris_xgboost/           # Main Iris classification implementation
│   ├── pipelines/          # KFP pipeline definitions
│   │   ├── components/     # Reusable pipeline components
│   │   ├── iris_pipeline_training.py
│   │   └── iris_pipeline_inference.py
│   ├── models/             # Pydantic models for API
│   ├── server.py           # FastAPI serving application
│   ├── bq_dataloader.py    # BigQuery data loading utility
│   └── constants.py        # Configuration constants
├── workflows/              # Alternative workflow implementations
├── notebooks/              # Example notebooks and experiments
└── dataflow/               # Dataflow streaming pipelines
    └── iris_streaming_pipeline.py
schemas/                    # Input/output schemas for Vertex AI
Dockerfile                  # Container definition
pyproject.toml              # Project dependencies
pipeline.yaml               # Pipeline configuration
deploy_dataflow_streaming.sh # Dataflow streaming deployment script
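
For orientation, server.py wraps the trained model in a FastAPI app whose request bodies are validated by the Pydantic models under models/. A minimal sketch of what that serving surface could look like (the endpoint path, field names, and model filename are illustrative assumptions, not the repository's actual schema):

# Hypothetical FastAPI serving app; paths and field names are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load("model.joblib")  # blessed model baked into the container (assumed filename)

class IrisFeatures(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post("/predict")
def predict(features: IrisFeatures) -> dict:
    row = [[features.sepal_length, features.sepal_width,
            features.petal_length, features.petal_width]]
    return {"prediction": int(model.predict(row)[0])}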

Prerequisites

  • Python 3.9-3.10
  • Google Cloud Project with enabled APIs:
    • Vertex AI
    • BigQuery
    • Cloud Storage
  • Service account with appropriate permissions
  • uv package manager (for dependency management)

Installation

# Clone the repository
git clone <repository-url>
cd ml_pipelines_kfp

# Install dependencies
uv pip install -e .

Usage

1. Load Training Data to BigQuery

# Set up credentials and load Iris dataset
./src/ml_pipelines_kfp/iris_xgboost/load_data.sh
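
Under the hood this seeds the ml_datasets.iris table. A minimal sketch of the same idea with the google-cloud-bigquery client (the column naming is an assumption; see bq_dataloader.py for the actual logic):

# Sketch: load the Iris dataset into BigQuery (column names are assumptions).
from google.cloud import bigquery
from sklearn import datasets

iris = datasets.load_iris(as_frame=True)
df = iris.frame.rename(columns=lambda c: c.replace(" (cm)", "").replace(" ", "_"))

client = bigquery.Client(project="deeplearning-sahil")
job = client.load_table_from_dataframe(df, "deeplearning-sahil.ml_datasets.iris")
job.result()  # block until the load job completes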

2. Run Training Pipeline

# Execute the training pipeline on Vertex AI
python src/ml_pipelines_kfp/iris_xgboost/pipelines/iris_pipeline_training.py

This will:

  • Load data from BigQuery
  • Train the candidate models (Decision Tree, Random Forest, XGBoost) in parallel
  • Evaluate and select the best model
  • Register the model in Vertex AI Model Registry with "blessed" alias
  • Deploy the blessed model to FastAPI service on Cloud Run
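
The script submits the compiled pipeline to Vertex AI Pipelines. In outline, assuming the compiled pipeline.yaml from the repo root and an illustrative staging bucket:

# Sketch: submit the compiled KFP pipeline to Vertex AI (bucket is an assumption).
from google.cloud import aiplatform

aiplatform.init(project="deeplearning-sahil", location="us-central1")
job = aiplatform.PipelineJob(
    display_name="iris-training",
    template_path="pipeline.yaml",                   # compiled pipeline in the repo root
    pipeline_root="gs://your-bucket/pipeline-root",  # assumed staging location
)
job.run()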

3. Run Batch Inference Pipeline

# Execute batch inference
python src/ml_pipelines_kfp/iris_xgboost/pipelines/iris_pipeline_inference.py

4. Real-time Streaming Inference

Deploy a Dataflow streaming job for real-time inference:

# Deploy streaming pipeline (update SERVICE_URL with actual Cloud Run URL)
./deploy_dataflow_streaming.sh

Start generating test data:

# Run data producer to send samples to Pub/Sub
python src/ml_pipelines_kfp/iris_xgboost/pubsub_producer.py
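
The producer publishes JSON-encoded Iris samples to a Pub/Sub topic, roughly like this (the topic name and payload fields are assumptions):

# Sketch of the Pub/Sub producer; topic name and fields are assumptions.
import json
import random
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("deeplearning-sahil", "iris-inference")

sample = {
    "sepal_length": round(random.uniform(4.0, 8.0), 1),
    "sepal_width": round(random.uniform(2.0, 4.5), 1),
    "petal_length": round(random.uniform(1.0, 7.0), 1),
    "petal_width": round(random.uniform(0.1, 2.5), 1),
}
future = publisher.publish(topic_path, json.dumps(sample).encode("utf-8"))
print(f"Published message {future.result()}")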

Development

Code Quality

# Format code
black src/

# Lint
ruff check src/

# Type checking
mypy src/

Architecture

The project follows a component-based architecture where each ML pipeline step is a self-contained KFP component:

  1. Data Component: Loads and splits data from BigQuery
  2. Model Components: Implements various ML algorithms
  3. Evaluation Component: Compares model performance
  4. Registry Component: Manages model versioning with "blessed" aliases
  5. Deployment Component: Deploys blessed models to Cloud Run FastAPI services
  6. Inference Component: Performs batch predictions
  7. Streaming Component: Real-time inference via Dataflow and Pub/Sub
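
In KFP terms, each of these is a Python function decorated with @dsl.component, declaring typed artifact inputs and outputs. An illustrative component in that style (not copied from the repository; the "target" column is an assumption):

# Illustrative KFP v2 component in the style this project uses.
from kfp import dsl
from kfp.dsl import Dataset, Input, Metrics, Model, Output

@dsl.component(base_image="python:3.10",
               packages_to_install=["scikit-learn", "pandas"])
def train_random_forest(train_data: Input[Dataset],
                        model: Output[Model],
                        metrics: Output[Metrics]):
    import joblib
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    df = pd.read_csv(train_data.path)  # assumes a CSV with a "target" column
    X, y = df.drop(columns=["target"]), df["target"]
    clf = RandomForestClassifier().fit(X, y)
    metrics.log_metric("train_accuracy", clf.score(X, y))
    joblib.dump(clf, model.path)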

Configuration

Key configuration is managed in src/ml_pipelines_kfp/iris_xgboost/constants.py:

  • Project ID: deeplearning-sahil
  • Region: us-central1
  • Dataset: ml_datasets.iris
  • Model naming and versioning
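
The constant names below are assumptions about the module's shape; the values are the ones listed above:

# Hypothetical shape of constants.py; values come from this README.
PROJECT_ID = "deeplearning-sahil"
REGION = "us-central1"
BQ_DATASET = "ml_datasets"
BQ_TABLE = "iris"
MODEL_DISPLAY_NAME = "iris-classifier"  # assumed naming convention
BLESSED_ALIAS = "blessed"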

CI/CD

The repository includes a GitHub Actions workflow (.github/workflows/cicd.yaml) that:

  • Builds Docker images for KFP components
  • Builds generic FastAPI inference containers
  • Pushes to Google Artifact Registry
  • Triggers on pushes to main branch

Technologies

  • Orchestration: Kubeflow Pipelines 2.8.0
  • Cloud Platform: Google Cloud (Vertex AI, BigQuery, GCS, Cloud Run, Dataflow)
  • ML Frameworks: scikit-learn, XGBoost
  • API Framework: FastAPI
  • Streaming: Apache Beam, Dataflow, Pub/Sub
  • Data Processing: Pandas, Polars, Dask
  • Package Management: uv, Hatchling

Deployment Architecture

Model Deployment Strategy

The project uses a blessed model pattern for production deployments:

  1. Training Pipeline: Trains multiple models and selects the best performer
  2. Model Registry: Stores the winning model in Vertex AI with "blessed" alias
  3. Deployment Pipeline: Automatically deploys only "blessed" models to production
  4. Cost Optimization: Serves models from FastAPI containers on Cloud Run rather than always-on Vertex AI endpoints
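
In the Vertex AI SDK this pattern maps to model version aliases: the winning version is tagged "blessed" at registration, and deployment resolves that alias. A minimal sketch (the display name, artifact URI, and serving image are assumptions):

# Sketch of the blessed-model pattern via Vertex AI version aliases.
from google.cloud import aiplatform

aiplatform.init(project="deeplearning-sahil", location="us-central1")

# Training pipeline: register the winning model under the "blessed" alias.
model = aiplatform.Model.upload(
    display_name="iris-classifier",          # assumed name
    artifact_uri="gs://your-bucket/model/",  # assumed artifact location
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"),
    version_aliases=["blessed"],
)

# Deployment pipeline: resolve whichever version is currently blessed.
blessed = aiplatform.Model(model_name=f"{model.resource_name}@blessed")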

Streaming Architecture

Real-time inference is handled through:

  1. Data Ingestion: Pub/Sub receives real-time inference requests
  2. Stream Processing: Dataflow processes messages and calls FastAPI services
  3. Model Serving: Cloud Run hosts FastAPI containers with blessed models
  4. Results Storage: Predictions are written to BigQuery for monitoring
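
A condensed Apache Beam sketch of that flow (the subscription, results table, and SERVICE_URL value are assumptions; the real pipeline is dataflow/iris_streaming_pipeline.py):

# Sketch: Pub/Sub -> FastAPI on Cloud Run -> BigQuery, as a streaming Beam job.
import json
import apache_beam as beam
import requests
from apache_beam.options.pipeline_options import PipelineOptions

SERVICE_URL = "https://iris-serving-xxxx-uc.a.run.app/predict"  # assumed Cloud Run URL

def predict(message: bytes) -> dict:
    features = json.loads(message.decode("utf-8"))
    response = requests.post(SERVICE_URL, json=features, timeout=10)
    return {**features, "prediction": response.json()["prediction"]}

with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
    (p
     | beam.io.ReadFromPubSub(
         subscription="projects/deeplearning-sahil/subscriptions/iris-inference-sub")
     | beam.Map(predict)
     | beam.io.WriteToBigQuery(
         "deeplearning-sahil:ml_datasets.iris_predictions",  # assumed results table
         schema=("sepal_length:FLOAT,sepal_width:FLOAT,petal_length:FLOAT,"
                 "petal_width:FLOAT,prediction:INTEGER"),
         write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))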

Key Benefits

  • Cost Effective: Cloud Run FastAPI services cost ~90% less than Vertex AI endpoints
  • Scalable: Dataflow auto-scales based on Pub/Sub message volume
  • Reliable: Only production-ready "blessed" models are deployed
  • Observable: All predictions logged to BigQuery with metadata
