Lead Applied AI/ML Engineer (Solutions Architecture) @ Databricks | Author | Open Source Contributor
Building ML platforms at scale. Helping enterprises ship AI from prototype to production.
Packt Publishing, 2023 | 244 pages
End-to-end guide to building production ML systems on Databricks - from data engineering to MLOps. Reached bestseller status in its category within 2 weeks of release.
Research Affiliate, Johns Hopkins University
A research program examining how AI systems fail when deployment conditions differ from training/calibration:
| Paper | Focus | arXiv |
|---|---|---|
| The Semantic Illusion | Embedding-based hallucination detection fails on RLHF outputs (95% coverage to 100% FPR) | 2512.15068 |
| Paper | Venue |
|---|---|
| Demystifying Large Language Models | IJCET |
| Reinforcement Learning for Real-World Impact | IJSRCET |
| AI in Education: Opportunities and Challenges | IAEME |
| AI in Healthcare: Data to Patient Outcomes | IRJMETS |
Active contributor to MLflow (23K+ stars) - the leading open-source ML lifecycle platform.
Building integrations that connect MLflow's GenAI evaluation with major LLM evaluation frameworks:
| Integration | Status | PR | Ecosystem Reach |
|---|---|---|---|
| Phoenix (Arize) | Merged | #19473 | 500K+ monthly downloads |
| TruLens | In Review | #19492 | 100K+ monthly downloads |
| Guardrails AI | In Review | #20038 | 200K+ monthly downloads |
Native UV integration for MLflow model logging - automatic dependency inference for UV-managed projects:
| Feature | PR | Description |
|---|---|---|
| UV Support (Phase 1 + 2) | #20344 | Auto-detection, uv export, dependency groups, UV sync |
| PR | Feature | Status |
|---|---|---|
| #19152 | inference_params for LLM Judges (temperature, top_p) | Merged |
| #19248 | Configurable parallelism (MLFLOW_GENAI_EVAL_MAX_SCORER_WORKERS) | Merged |
Serverless GPU deployment for MLflow models on Modal:
pip install mlflow-modal-deploy
mlflow deployments create -t modal -m models:/my-model/1 --name my-deployment
- Auto-scaling from zero to thousands of GPUs (T4 to H200)
- Sub-second cold starts with Modal's container snapshots
- Native MLflow deployment interface
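Because the plugin targets MLflow's native deployment interface, the CLI call above can probably also be made from Python through MLflow's generic deployments client. This is a minimal sketch, not the plugin's documented usage. It assumes the plugin registers the target name `modal` (as in `-t modal`) and that `mlflow` and `mlflow-modal-deploy` are installed. The helper `deploy_to_modal` is a hypothetical name used only for illustration.

```python
def deploy_to_modal(name: str, model_uri: str, target: str = "modal"):
    """Create a deployment via MLflow's generic deployments client.

    Assumption: the mlflow-modal-deploy plugin registers the "modal"
    target, mirroring `mlflow deployments create -t modal`.
    """
    # Imported lazily so this module loads even where mlflow is absent.
    from mlflow.deployments import get_deploy_client

    client = get_deploy_client(target)
    # create_deployment is part of MLflow's BaseDeploymentClient interface.
    return client.create_deployment(name=name, model_uri=model_uri)
```

Calling `deploy_to_modal("my-deployment", "models:/my-model/1")` would mirror the CLI example. Any plugin-specific configuration is left out because it is not described here.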
- TechFutures 2025 (NYC) - End-to-End MLOps Pipelines Workshop (GitHub)
- Data Con LA 2022 - Simplifying AI/ML using Databricks Feature Store (YouTube)
- Data Con LA 2021 - Detecting Fake Reviews at Scale using Spark and John Snow Labs (YouTube)
- NYU Guest Lecture - ML Pipeline with Apache Spark




