Debu Sinha debu-sinha

Hey, I'm Debu Sinha

Lead Applied AI/ML Engineer (Solutions Architecture) @ Databricks | Author | Open Source Contributor

Building ML platforms at scale. Helping enterprises ship AI from prototype to production.

Book

Practical Machine Learning on Databricks

Packt Publishing, 2023 | 244 pages

End-to-end guide to building production ML systems on Databricks, from data engineering to MLOps. Reached best-seller status in its category within two weeks of release.

Research

Research Affiliate, Johns Hopkins University

AI Reliability Under Distribution Shift

A research program examining how AI systems fail when deployment conditions differ from training/calibration:

Paper	Focus	arXiv
The Semantic Illusion	Embedding-based hallucination detection fails on RLHF outputs (95% coverage to 100% FPR)	2512.15068

Other Publications

Paper	Venue
Demystifying Large Language Models	IJCET
Reinforcement Learning for Real-World Impact	IJSRCET
AI in Education: Opportunities and Challenges	IAEME
AI in Healthcare: Data to Patient Outcomes	IRJMETS

Open Source Contributions

MLflow Core Contributor

Active contributor to MLflow (23K+ stars) - the leading open-source ML lifecycle platform.

Third-Party Scorer Ecosystem

Building integrations that connect MLflow's GenAI evaluation with major LLM evaluation frameworks:

Integration	Status	PR	Ecosystem Reach
Phoenix (Arize)	Merged	#19473	500K+ monthly downloads
TruLens	In Review	#19492	100K+ monthly downloads
Guardrails AI	In Review	#20038	200K+ monthly downloads

UV Package Manager Support

Native UV integration for MLflow model logging - automatic dependency inference for UV-managed projects:

Feature	PR	Description
UV Support (Phase 1 + 2)	#20344	Auto-detection, `uv export`, dependency groups, UV sync

Design Doc | UV Issue #17702
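
To make the auto-detection and `uv export` steps concrete, here is a minimal sketch of how a UV-managed project could be turned into a pinned requirements list. It is not the code from the PR: the helper names (`is_uv_project`, `uv_export_command`, `parse_requirements`, `infer_requirements`) are hypothetical, and the detection heuristic (a `pyproject.toml` next to a `uv.lock`) is an assumption.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def is_uv_project(project_dir: Path) -> bool:
    """Assumed heuristic: a UV project has pyproject.toml and uv.lock side by side."""
    return (project_dir / "pyproject.toml").is_file() and (project_dir / "uv.lock").is_file()


def uv_export_command(group: Optional[str] = None) -> List[str]:
    """Build a `uv export` call that emits pinned requirements without hashes."""
    cmd = ["uv", "export", "--format", "requirements-txt", "--no-hashes"]
    if group:
        # Include one extra dependency group from pyproject.toml.
        cmd += ["--group", group]
    return cmd


def parse_requirements(text: str) -> List[str]:
    """Keep requirement lines; drop comments, blank lines, and option lines such as `-e .`."""
    requirements = []
    for line in text.splitlines():
        line = line.split(" #", 1)[0].strip()  # strip trailing and indented comments
        if not line or line.startswith(("#", "-")):
            continue
        requirements.append(line)
    return requirements


def infer_requirements(project_dir: Path, group: Optional[str] = None) -> List[str]:
    """Return pinned requirements for a UV project, or an empty list otherwise."""
    if not is_uv_project(project_dir):
        return []
    result = subprocess.run(
        uv_export_command(group),
        cwd=project_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_requirements(result.stdout)
```

Calling `infer_requirements(Path("."))` inside a UV project would return a list that could be passed as pip requirements when logging a model; outside a UV project it returns an empty list so a caller can fall back to another inference path.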

Other Contributions

PR	Feature	Status
#19152	`inference_params` for LLM Judges (temperature, top_p)	Merged
#19248	Configurable parallelism (`MLFLOW_GENAI_EVAL_MAX_SCORER_WORKERS`)	Merged
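
The configurable-parallelism idea can be illustrated with a small sketch: read a worker count from `MLFLOW_GENAI_EVAL_MAX_SCORER_WORKERS` and run scorers for one row on a bounded thread pool. This is a generic illustration, not MLflow's implementation; `max_scorer_workers`, `run_scorers`, and the fallback value of 10 are assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

ENV_VAR = "MLFLOW_GENAI_EVAL_MAX_SCORER_WORKERS"
DEFAULT_WORKERS = 10  # assumed fallback for this sketch, not MLflow's actual default


def max_scorer_workers() -> int:
    """Read the worker limit from the environment, falling back on bad or missing input."""
    raw = os.environ.get(ENV_VAR, "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_WORKERS
    return max(1, value)  # never go below one worker


def run_scorers(scorers: Dict[str, Callable[[dict], Any]], row: dict) -> Dict[str, Any]:
    """Run every scorer on one row concurrently, bounded by the configured limit."""
    with ThreadPoolExecutor(max_workers=max_scorer_workers()) as pool:
        futures = {name: pool.submit(fn, row) for name, fn in scorers.items()}
        return {name: future.result() for name, future in futures.items()}
```

A thread pool suits this case because LLM-judge scorers spend most of their time waiting on network calls, so a lower limit mainly helps when a provider enforces rate limits.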

MLflow-Modal Plugin

Serverless GPU deployment for MLflow models on Modal:

pip install mlflow-modal-deploy
mlflow deployments create -t modal -m models:/my-model/1 --name my-deployment

Auto-scaling from zero to thousands of GPUs (T4 to H200)
Sub-second cold starts with Modal's container snapshots
Native MLflow deployment interface

GitHub | PyPI
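
Because the plugin exposes the native MLflow deployment interface, the CLI call above should have a Python equivalent through `mlflow.deployments`. The sketch below assumes the plugin registers the target name `modal` (as the `-t modal` flag suggests) and that no extra config is required; `model_uri` and `deploy_to_modal` are hypothetical helper names.

```python
def model_uri(model_name: str, version: int) -> str:
    """Build a model registry URI such as models:/my-model/1."""
    return f"models:/{model_name}/{version}"


def deploy_to_modal(name: str, model_name: str, version: int):
    """Create a deployment through MLflow's deployment client (requires the plugin)."""
    # Imported lazily so model_uri() stays usable without MLflow installed.
    from mlflow.deployments import get_deploy_client

    client = get_deploy_client("modal")  # assumed target name, taken from `-t modal`
    return client.create_deployment(name=name, model_uri=model_uri(model_name, version))
```

With MLflow and the plugin installed, `deploy_to_modal("my-deployment", "my-model", 1)` would mirror the `mlflow deployments create` command shown earlier.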

Speaking

TechFutures 2025 (NYC) - End-to-End MLOps Pipelines Workshop (GitHub)
Data Con LA 2022 - Simplifying AI/ML using Databricks Feature Store (YouTube)
Data Con LA 2021 - Detecting Fake Reviews at Scale using Spark and John Snow Labs (YouTube)
NYU Guest Lecture - ML Pipeline with Apache Spark
