RideOps Cancellation Command Center - A complete data pipeline and ML analytics solution for ride-sharing operations.
This project provides:
- 🔄 Bronze → Silver → Gold data pipeline (Delta Lake)
- 🤖 ML Model for cancellation prediction
- 📊 Lakeview Dashboard for real-time monitoring
- 🚀 100% Programmatic Deployment (no UI required)
Install UV: https://docs.astral.sh/uv/getting-started/installation/
Install the Databricks CLI from https://docs.databricks.com/dev-tools/cli/databricks-cli.html
Authenticate to your Databricks workspace, if you have not done so already:

```
$ databricks configure
```
To deploy a development copy of this project, type:

```
$ databricks bundle deploy --target dev
```

(Note that "dev" is the default target, so the `--target` parameter is optional here.)

This deploys everything that's defined for this project. For example, the default template would deploy a job called `[dev yourname] databricks_hackathon_job` to your workspace. You can find that job by opening your workspace and clicking on Workflows.
Similarly, to deploy a production copy, type:

```
$ databricks bundle deploy --target prod
```

Note that the default job from the template has a schedule that runs every day (defined in resources/databricks_hackathon.job.yml). The schedule is paused when deploying in development mode (see https://docs.databricks.com/dev-tools/bundles/deployment-modes.html).
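The daily schedule mentioned above lives in the job resource file. A minimal sketch of what resources/databricks_hackathon.job.yml might contain — the cron expression, task key, and notebook path here are illustrative assumptions, not copied from the actual file:

```yaml
resources:
  jobs:
    databricks_hackathon_job:
      name: databricks_hackathon_job
      schedule:
        # Daily run; automatically paused when deployed with --target dev
        quartz_cron_expression: "0 0 6 * * ?"
        timezone_id: "UTC"
      tasks:
        - task_key: bronze_ingestion          # illustrative task name
          notebook_task:
            notebook_path: ../notebooks/01_bronze_ingestion.py
```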
To run a job or pipeline, use the "run" command:

```
$ databricks bundle run
```
Optionally, install the Databricks extension for Visual Studio Code for local development from https://docs.databricks.com/dev-tools/vscode-ext.html. It can configure your virtual environment and set up Databricks Connect for running unit tests locally. When not using these tools, consult your development environment's documentation and/or the documentation for Databricks Connect for manually setting up your environment (https://docs.databricks.com/en/dev-tools/databricks-connect/python/index.html).
For documentation on the Databricks asset bundles format used for this project, and for CI/CD configuration, see https://docs.databricks.com/dev-tools/bundles/index.html.
Deploy the Cancellation Command Center dashboard programmatically:

```
# Install SDK
pip install databricks-sdk

# Set credentials
export DATABRICKS_HOST="https://adb-2580806725893634.14.azuredatabricks.net"
export DATABRICKS_TOKEN="your_token_here"

# Deploy dashboard
python deploy_lakeview_dashboard.py
```

The deployed dashboard includes:

- 10 interactive widgets across 2 pages
- Real-time metrics: Total bookings, cancellation rate, revenue at risk
- ML insights: High-risk zones, peak hours analysis
- Visualizations: Heatmaps, trends, performance tables
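Under the hood, `deploy_lakeview_dashboard.py` presumably reads the credentials from the environment and submits the `.lvdash.json` definition, which Lakeview expects as a serialized JSON string. A hedged sketch of that payload-building step — the display name, warehouse handling, and file layout are assumptions, and the actual script may differ:

```python
import json
import os
import tempfile

def build_dashboard_payload(json_path: str, warehouse_id: str) -> dict:
    """Assemble a Lakeview create-dashboard payload.

    Lakeview stores the dashboard definition as a serialized JSON
    string, so the .lvdash.json export is re-serialized into the body.
    """
    with open(json_path) as f:
        definition = json.load(f)
    return {
        "display_name": "Cancellation Command Center",
        "warehouse_id": warehouse_id,
        "serialized_dashboard": json.dumps(definition),
    }

# Demo with a stand-in definition file (the real one is
# resources/lakeview_dashboard.lvdash.json in this repo).
with tempfile.NamedTemporaryFile("w", suffix=".lvdash.json",
                                 delete=False) as f:
    json.dump({"pages": []}, f)
    demo_path = f.name

payload = build_dashboard_payload(demo_path, warehouse_id="abc123")
os.unlink(demo_path)
```

The payload can then be handed to the Databricks SDK or posted to the REST API.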
- 📖 Quick Start: See QUICK_DEPLOY.md
- 📚 Full Guide: See LAKEVIEW_DEPLOYMENT_GUIDE.md
- ✅ Setup Summary: See DASHBOARD_SETUP_COMPLETE.md
Deployment options:

- Python Script (recommended) - Auto-detects warehouse
- Databricks Bundle - Infrastructure as code
- REST API - Custom automation
- Terraform - Multi-cloud IaC
See LAKEVIEW_DEPLOYMENT_GUIDE.md for detailed instructions.
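For the REST API option, the Lakeview create-dashboard endpoint can be called directly. A minimal sketch that only builds the request using the same env vars as above — the stand-in definition is an assumption, and nothing is actually sent:

```python
import json
import os
import urllib.request

# Host/token come from the same env vars used earlier in this README;
# the defaults below are placeholders so the sketch runs anywhere.
host = os.environ.get("DATABRICKS_HOST", "https://example.azuredatabricks.net")
token = os.environ.get("DATABRICKS_TOKEN", "dapi-example")

body = json.dumps({
    "display_name": "Cancellation Command Center",
    # Stand-in definition; the real one is lakeview_dashboard.lvdash.json
    "serialized_dashboard": json.dumps({"pages": []}),
}).encode()

req = urllib.request.Request(
    f"{host}/api/2.0/lakeview/dashboards",
    data=body,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would submit it; omitted here so the
# sketch runs without a live workspace.
```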
```
databricks_hackathon/
├── notebooks/                            # Data pipeline notebooks
│   ├── 00_setup.py                       # Initial setup
│   ├── 01_bronze_ingestion.py            # Raw data ingestion
│   ├── 02_silver_transformation.py       # Data cleaning
│   ├── 03_aggregate_gold.py              # Feature engineering
│   └── 04_train_ml_model.py              # ML model training
├── resources/                            # Dashboard & configs
│   ├── lakeview_dashboard.lvdash.json    # Dashboard definition
│   ├── databricks_hackathon.dashboard.yml  # Bundle config
│   ├── lakeview_dashboard.sql            # SQL-based setup
│   └── *.yml                             # Job/pipeline configs
├── deploy_lakeview_dashboard.py          # Deployment script
└── docs/
    ├── QUICK_DEPLOY.md
    ├── LAKEVIEW_DEPLOYMENT_GUIDE.md
    └── DASHBOARD_SETUP_COMPLETE.md
```
- Bronze: Raw CSV ingestion
- Silver: Data cleaning, normalization, feature engineering
- Gold: Aggregated metrics for analytics
- ML: Cancellation prediction model
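The layer responsibilities above can be sketched end to end. This toy version uses plain Python structures in place of Delta tables, and the column names and cleaning rules are illustrative, not taken from the actual notebooks:

```python
from collections import defaultdict

# Bronze: raw rows as ingested — duplicates and inconsistent casing included.
bronze = [
    {"booking_id": "1", "zone": "Downtown", "status": "Cancelled"},
    {"booking_id": "1", "zone": "Downtown", "status": "Cancelled"},  # dup
    {"booking_id": "2", "zone": "Airport",  "status": "completed"},
    {"booking_id": "3", "zone": "Airport",  "status": "CANCELLED"},
]

# Silver: deduplicate on booking_id and normalize the status column.
seen, silver = set(), []
for row in bronze:
    if row["booking_id"] in seen:
        continue
    seen.add(row["booking_id"])
    silver.append({**row, "status": row["status"].lower()})

# Gold: cancellation rate per zone, the kind of metric the dashboard reads.
totals, cancelled = defaultdict(int), defaultdict(int)
for row in silver:
    totals[row["zone"]] += 1
    cancelled[row["zone"]] += row["status"] == "cancelled"
gold = {zone: cancelled[zone] / totals[zone] for zone in totals}
```

In the real pipeline each step writes a Delta table that the next notebook reads, and the gold metrics feed both the dashboard and the ML features.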
- ✅ Medallion architecture (Bronze/Silver/Gold)
- ✅ Real-time dashboard with 10 widgets
- ✅ ML model for cancellation prediction
- ✅ Programmatic deployment (no UI)
- ✅ Databricks Asset Bundle integration
- ✅ Complete documentation