chore(spark): migrate SDK to kubeflow_spark_api Pydantic models #295
tariq-hasan wants to merge 4 commits into kubeflow:main
Conversation
🎉 Welcome to the Kubeflow SDK! 🎉 Thanks for opening your first PR! We're happy to have you as part of our community 🚀 Here's what happens next:
Join the community:
Feel free to ask questions in the comments if you need any help or clarification!
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Pull Request Test Coverage Report for Build 22043884336
💛 - Coveralls
Pull request overview
This PR refactors the Spark SDK to use typed Pydantic models from kubeflow_spark_api instead of raw dictionaries for CRD construction. This aligns the Spark SDK with the established architecture pattern used by the Trainer SDK.
Changes:
- Added kubeflow-spark-api>=2.3.0 dependency and migrated from dict-based CRD construction to typed Pydantic models (see the sketch after this list)
- Updated all option implementations to work with Pydantic models instead of dictionaries
- Refactored backend methods to convert to/from Pydantic models at the Kubernetes API boundary
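A minimal sketch of the dict-to-model shift described in these changes. The model names, field names, and API group/version below are assumptions for illustration, not the actual kubeflow_spark_api schema:

```python
# Illustrative only: SparkConnect, SparkConnectSpec, and the field names shown
# are assumptions, not confirmed against the generated kubeflow_spark_api models.
from kubeflow_spark_api import models  # assumed import path

# Before: the SparkConnect CR body was assembled as a raw dict.
body = {
    "apiVersion": "sparkoperator.k8s.io/v1alpha1",
    "kind": "SparkConnect",
    "metadata": {"name": "my-session"},
    "spec": {"sparkVersion": "4.0.0"},
}

# After: the same body is built from typed Pydantic models, so misspelled or
# mis-typed fields fail at construction time rather than at the API server.
spark_connect = models.SparkConnect(
    apiVersion="sparkoperator.k8s.io/v1alpha1",
    kind="SparkConnect",
    metadata={"name": "my-session"},
    spec=models.SparkConnectSpec(sparkVersion="4.0.0"),
)
```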
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
Summary per file:
| File | Description |
|---|---|
| pyproject.toml | Added kubeflow-spark-api>=2.3.0 dependency |
| uv.lock | Lock file updates for new kubeflow-spark-api dependency |
| kubeflow/spark/backends/kubernetes/backend.py | Convert between dict and Pydantic models at API boundary using .to_dict() and .from_dict() |
| kubeflow/spark/backends/kubernetes/utils.py | Refactored build_spark_connect_crd to return Pydantic model; renamed parse_spark_connect_status to get_spark_connect_info_from_cr with Pydantic input |
| kubeflow/spark/types/options.py | Updated all option callables to accept SparkConnect Pydantic model instead of dict |
| kubeflow/spark/backends/kubernetes/backend_test.py | Enhanced mock responses to include all required fields for Pydantic model validation |
| kubeflow/spark/backends/kubernetes/utils_test.py | Updated tests to work with Pydantic models and added validation test for invalid CR |
| kubeflow/spark/types/options_test.py | Migrated tests to use spark_connect_model fixture and verify Pydantic model attributes |
| hack/Dockerfile.spark-e2e-runner | Added --pre flag to allow installation of pre-release versions |
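A sketch of the API-boundary pattern described for backend.py, using the standard Kubernetes CustomObjectsApi. The group, version, plural, and model names are placeholders and may not match the real backend implementation:

```python
# Hedged sketch: the real backend.py methods, CR group/version/plural, and the
# SparkConnect model name are assumptions based on the review summary above.
from kubernetes import client, config
from kubeflow_spark_api import models  # assumed import path

config.load_kube_config()
api = client.CustomObjectsApi()

def create_spark_connect(spark_connect, namespace="default"):
    """Serialize the typed model to a dict only when calling the Kubernetes API."""
    return api.create_namespaced_custom_object(
        group="sparkoperator.k8s.io",   # assumed group
        version="v1alpha1",             # assumed version
        namespace=namespace,
        plural="sparkconnects",         # assumed plural
        body=spark_connect.to_dict(),
    )

def get_spark_connect(name, namespace="default"):
    """Parse the raw CR dict back into a typed model on the way out."""
    raw = api.get_namespaced_custom_object(
        group="sparkoperator.k8s.io",
        version="v1alpha1",
        namespace=namespace,
        plural="sparkconnects",
        name=name,
    )
    return models.SparkConnect.from_dict(raw)  # model name is an assumption
```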
role_spec.template = models.IoK8sApiCoreV1PodTemplateSpec()

# Convert existing template to dict, merge, and convert back
existing_dict = role_spec.template.to_dict() if role_spec.template else {}
Redundant None check in ternary expression. Since `role_spec.template` is guaranteed to be non-None after line 193, the ternary expression `role_spec.template.to_dict() if role_spec.template else {}` can be simplified to `role_spec.template.to_dict()`.
Suggested change:
- existing_dict = role_spec.template.to_dict() if role_spec.template else {}
+ existing_dict = role_spec.template.to_dict()
Signed-off-by: tariq-hasan <mmtariquehsn@gmail.com>
Force-pushed from 2df7db9 to 255b2ad (compare)
What this PR does / why we need it:
This PR migrates the Spark SDK from constructing CRDs with raw dictionaries to the typed Pydantic models provided by kubeflow_spark_api. There are no user-facing API changes in this PR.

What changed:
- CRDs are built with the typed Pydantic models from kubeflow_spark_api
- CR responses are parsed with .from_dict() instead of manual extraction (a minimal sketch follows below)
- Public types (Driver, Executor, SparkConnectInfo) are unchanged

Why:
Aligns the Spark SDK with the established architecture pattern used by the Trainer SDK and replaces untyped dict construction with validated Pydantic models.
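A minimal sketch of the manual-extraction-to-`.from_dict()` change, with assumed model and status attribute names:

```python
# Illustrative only: the SparkConnect model name and its status attributes are
# assumptions about the generated kubeflow_spark_api schema.
from kubeflow_spark_api import models  # assumed import path

# Raw dict as returned by the Kubernetes custom objects API (truncated example).
raw_cr = {
    "apiVersion": "sparkoperator.k8s.io/v1alpha1",
    "kind": "SparkConnect",
    "metadata": {"name": "my-session", "namespace": "default"},
    "status": {"state": "Ready"},
}

# Before: manual extraction from nested dicts.
state = raw_cr.get("status", {}).get("state")

# After: parse once with .from_dict() and read typed attributes.
cr = models.SparkConnect.from_dict(raw_cr)
state = cr.status.state if cr.status else None
```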
Testing:
Tested against kubeflow_spark_api==2.4.0rc0.

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #271
Checklist: