-
Notifications
You must be signed in to change notification settings - Fork 292
DPTP-4497: abort-all command actually aborts underlying job runs for the aggregator #4873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@smg247: This pull request references DPTP-4497 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughCentralized ProwJob constants in the public api package and extended the server abort flow to support aggregator ProwJobs by adding per-job abort logic, aggregator detection, and aborting aggregated jobs; updated tests to cover aggregated-abort scenarios. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cmd/payload-testing-prow-plugin/server.go (1)
633-636: Minor: Consider singular/plural handling for user messages.Similar to the test expectations, the success message uses "jobs" regardless of count. For better user experience, consider handling singular vs plural forms:
jobWord := "jobs" if totalJobsAborted == 1 { jobWord = "job" } return fmt.Sprintf("aborted %d active payload %s for pull request %s/%s#%d", totalJobsAborted, jobWord, org, repo, prNumber)This is a minor UX improvement and doesn't affect functionality.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (8)
cmd/payload-testing-prow-plugin/server.go(3 hunks)cmd/payload-testing-prow-plugin/server_test.go(2 hunks)pkg/api/constant.go(1 hunks)pkg/controller/prpqr_reconciler/prpqr_reconciler.go(4 hunks)pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go(3 hunks)pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go(2 hunks)pkg/jobrunaggregator/jobrunaggregatorlib/util.go(0 hunks)pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go(3 hunks)
💤 Files with no reviewable changes (1)
- pkg/jobrunaggregator/jobrunaggregatorlib/util.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
pkg/api/constant.gocmd/payload-testing-prow-plugin/server.gocmd/payload-testing-prow-plugin/server_test.gopkg/controller/prpqr_reconciler/prpqr_reconciler.gopkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.gopkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.gopkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go
🧬 Code graph analysis (5)
cmd/payload-testing-prow-plugin/server.go (2)
pkg/api/constant.go (2)
AggregationIDLabel(85-85)ProwJobJobNameAnnotation(89-89)pkg/release/config/config.go (1)
Job(53-60)
pkg/controller/prpqr_reconciler/prpqr_reconciler.go (2)
pkg/api/constant.go (1)
AggregationIDLabel(85-85)pkg/api/pullrequestpayloadqualification/v1/types.go (1)
PullRequestPayloadQualificationRunLabel(13-13)
pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go (1)
pkg/api/constant.go (1)
ProwJobJobNameAnnotation(89-89)
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go (1)
pkg/api/constant.go (1)
ProwJobJobNameAnnotation(89-89)
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go (1)
pkg/api/constant.go (1)
ProwJobJobNameAnnotation(89-89)
🔇 Additional comments (9)
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go (1)
9-10: LGTM! Refactoring to use centralized constant.The change correctly updates the code to use the public
api.ProwJobJobNameAnnotationconstant instead of a locally defined one, improving maintainability by consolidating the constant definition in the api package.Also applies to: 26-26
pkg/controller/prpqr_reconciler/prpqr_reconciler.go (1)
407-407: LGTM! Consistent refactoring to centralized constant.All usages of the aggregation ID label have been correctly updated to use
api.AggregationIDLabelfrom the public api package, improving maintainability and consistency across the codebase.Also applies to: 449-449, 783-783, 855-855
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go (1)
9-10: LGTM! Refactoring with updated documentation.The change correctly migrates to the centralized
api.ProwJobJobNameAnnotationconstant and updates the inline comment to reflect the new source, maintaining clear documentation.Also applies to: 19-19, 28-28
pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go (1)
20-20: LGTM! Consistent migration to public constant.Both annotation lookups have been correctly updated to use
api.ProwJobJobNameAnnotation, moving away from the internaljobrunaggregatorlibconstant to the centralized public api constant.Also applies to: 99-99, 431-431
pkg/api/constant.go (1)
84-89: LGTM! Well-documented centralized constants.The new
AggregationIDLabelandProwJobJobNameAnnotationconstants are properly defined with clear documentation in the public api package, providing a single source of truth for these label/annotation keys across the codebase.cmd/payload-testing-prow-plugin/server_test.go (1)
1056-1147: LGTM! Comprehensive test for aggregator abort functionality.The new test case thoroughly validates the cascading abort behavior for aggregator jobs:
- Correctly sets up an aggregator job with the
api.AggregationIDLabellabel- Creates multiple underlying aggregated jobs with various states
- Verifies that only active jobs (aggregator + 2 active aggregated jobs) are counted, excluding the completed job
- Uses the centralized constants for consistency
The test coverage aligns well with the new functionality in server.go.
cmd/payload-testing-prow-plugin/server.go (3)
528-541: LGTM! Clean helper method with proper error handling.The
abortJobmethod provides good separation of concerns by extracting the abort logic into a reusable helper. It correctly:
- Checks if the job is already complete before attempting abort
- Updates the job state to
AbortedState- Returns both success status and error for proper handling by callers
577-629: LGTM! Robust abort logic for aggregator and regular jobs.The enhanced
abortAlllogic correctly:
- Detects aggregator jobs using the
isAggregatorJobhelper- Lists all aggregated jobs by the
AggregationIDLabel- Aborts each job using the
abortJobhelper, including both aggregator and aggregated jobs- Tracks total jobs aborted across both aggregator and regular job flows
- Properly handles errors and counts only successfully aborted jobs
The cascading abort functionality ensures that when an aggregator job is aborted, all its underlying aggregated jobs are also aborted, which aligns with the PR objectives.
543-561: No actionable changes needed. TheisAggregatorJobmethod's use of the"aggregator-"prefix check is not fragile. This prefix is an established convention throughout the codebase—aggregator jobs are deliberately created with this prefix (seepkg/controller/prpqr_reconciler/prpqr_reconciler.golines 242 and 489), and the pattern is validated by existing tests and used consistently across components including the UI server (which definesaggregatorPrefix = "aggregator-"). The two-level check combining label verification and job name prefix validation is reasonable defensive programming.Likely an incorrect or invalid review comment.
|
/label tide/merge-method-squash |
Co-authored-by: Deep Mistry <dmistry@redhat.com>
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deepsm007, smg247 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cmd/payload-testing-prow-plugin/server.go (1)
607-611: Consider using a clearer error indicator for aggregated job listing failures.The synthetic job name
fmt.Sprintf("%s-aggregated-jobs", jobName)doesn't correspond to an actual ProwJob, which could be confusing when displayed in the error message. Consider using a more descriptive format that clearly indicates this was a listing failure rather than a job abort failure.🔎 Suggested improvement
if err := s.kubeClient.List(s.ctx, &aggregatedJobs, &listOpts); err != nil { jobLogger.WithError(err).Error("failed to list aggregated jobs") - erroredJobs = append(erroredJobs, fmt.Sprintf("%s-aggregated-jobs", jobName)) + erroredJobs = append(erroredJobs, fmt.Sprintf("%s (failed to list aggregated jobs)", jobName)) continue }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
cmd/payload-testing-prow-plugin/server.go(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
cmd/payload-testing-prow-plugin/server.go
🧬 Code graph analysis (1)
cmd/payload-testing-prow-plugin/server.go (2)
pkg/api/constant.go (2)
AggregationIDLabel(85-85)ProwJobJobNameAnnotation(89-89)pkg/release/config/config.go (1)
Job(53-60)
🔇 Additional comments (3)
cmd/payload-testing-prow-plugin/server.go (3)
528-541: LGTM!Clean extraction of the abort logic into a reusable helper. The return tuple
(bool, error)clearly distinguishes between "already complete" and "failed to abort" scenarios.
543-561: LGTM!The dual validation approach (checking both the aggregation ID label and the "aggregator-" prefix in the job name annotation) provides robust identification of aggregator jobs. Defensive nil checks are appropriate.
596-634: Aggregator abort logic is well-structured.The implementation correctly:
- Detects aggregator jobs using both label and annotation
- Lists all related jobs by aggregation ID
- Skips the aggregator itself to avoid double processing
- Tracks all aborted jobs in the counter
The use of
Infolevel logging for "job was already complete" aligns with the previous review feedback.
|
/retest |
|
/pipeline required |
|
Scheduling required tests: Scheduling tests matching the |
|
/override ci/prow/breaking-changes |
|
@smg247: Overrode contexts on behalf of smg247: ci/prow/breaking-changes DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/retest-required |
|
/hold Revision f5d6d5f was retested 3 times: holding |
|
/test images |
|
/hold cancel |
|
@smg247: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This allows us to find job runs that the aggregator is tracking via the label and abort them as well.
I moved some constants around so they could be used without being redefined in the server.
Assisted by: Cursor