feat: wait for pod to be running when follow=True in get_job_logs#183
Open
aniketpati1121 wants to merge 2 commits into kubeflow:main
Signed-off-by: Aniket Patil <aniketpatil2027@gmail.com>
Hi @szaher @kramaranya
This PR implements the behavior requested in issue #182.
Previously, trainer.get_job_logs(job_id, follow=True) exited immediately if the pod did not yet exist or was still pending. This made it difficult for users to follow logs immediately after submitting a job, because pods are usually created asynchronously.
What this PR adds
When follow=True, the backend now waits for the pod to be created and to leave the Pending state.
Adds a simple polling loop with:
    timeout: 120 seconds
    poll interval: 2 seconds
Preserves the old behavior for follow=False: the call still returns immediately if no pod exists.
No API changes; fully backward compatible.
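For reviewers, the waiting logic described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name, the constant names, and the get_pod_phase callback are hypothetical. The callback stands in for whatever lookup the backend performs, returning the pod phase as a string, or None while the pod does not exist yet. The sleep and clock parameters are only there so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional

# Hypothetical constants mirroring the values listed in this description.
POD_WAIT_TIMEOUT_SECONDS = 120
POD_POLL_INTERVAL_SECONDS = 2


def wait_for_pod_ready_for_logs(
    get_pod_phase: Callable[[], Optional[str]],
    timeout: float = POD_WAIT_TIMEOUT_SECONDS,
    poll_interval: float = POD_POLL_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the pod exists and has left Pending, False on timeout.

    get_pod_phase is a hypothetical callback that returns the pod phase
    ("Pending", "Running", ...) or None while the pod does not exist yet.
    """
    deadline = clock() + timeout
    while True:
        phase = get_pod_phase()
        # The pod has been created and is no longer Pending: logs can be followed.
        if phase is not None and phase != "Pending":
            return True
        # Give up once the deadline has passed.
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

Under this sketch, a caller would run the wait only when follow=True and skip it entirely for follow=False, which keeps the immediate-return behavior unchanged.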
Why this is needed
Users commonly want to follow logs right after submitting a TrainingJob.
With the previous behavior, they needed to implement custom waiting logic.
This PR aligns the trainer experience with typical Kubernetes log-following behavior.
Testing
All existing tests pass (162 passed).
No breaking changes.
Manual tests were run locally.
Fixes #182.