@smg247
Member

@smg247 smg247 commented Dec 18, 2025

This allows us to find job runs that the aggregator is tracking via the label and abort them as well.

I moved some constants around so they could be used without being redefined in the server.

Assisted by: Cursor

@openshift-ci-robot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 18, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Dec 18, 2025

@smg247: This pull request references DPTP-4497 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.


@coderabbitai

coderabbitai bot commented Dec 18, 2025

Walkthrough

Centralized ProwJob constants in the public api package and extended the server abort flow to support aggregator ProwJobs by adding per-job abort logic, aggregator detection, and aborting aggregated jobs; updated tests to cover aggregated-abort scenarios.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Public API Constants: pkg/api/constant.go | Added exported constants AggregationIDLabel ("release.openshift.io/aggregation-id") and ProwJobJobNameAnnotation ("prow.k8s.io/job"). |
| Server abort logic: cmd/payload-testing-prow-plugin/server.go | Added func (s *server) abortJob(job *prowapi.ProwJob, logger *logrus.Entry) (bool, error) and func isAggregatorJob(job *prowapi.ProwJob) (string, bool); refactored abortAll to use abortJob and to list+abort aggregated jobs when encountering an aggregator; updated abort reporting to include total aborted counts. |
| Server tests: cmd/payload-testing-prow-plugin/server_test.go | Updated abort-test expectations to report aborted-count messages and added a test case verifying aborting an aggregator plus its aggregated jobs (multiple underlying jobs) produces the correct aborted counts and messages. |
| Constant reference migration: pkg/controller/prpqr_reconciler/prpqr_reconciler.go, pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go, pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go, pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go | Replaced local/internal uses of aggregation and prow-job annotation keys with api.AggregationIDLabel and api.ProwJobJobNameAnnotation. |
| Removed duplicate local constant: pkg/jobrunaggregator/jobrunaggregatorlib/util.go | Removed the exported ProwJobJobNameAnnotation constant from the jobrunaggregatorlib package (now provided by pkg/api). |
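
For orientation, here is a minimal sketch of what the centralized constants might look like and how a caller could build a label match from them. The two constant values are taken from the summary above; the `aggregationSelector` helper is hypothetical and only illustrates the idea of selecting jobs by aggregation ID, it is not code from this PR.

```go
package main

// Constant values as listed in the change summary; in the PR they live in pkg/api.
const (
	// AggregationIDLabel is the label key shared by an aggregator and the jobs it tracks.
	AggregationIDLabel = "release.openshift.io/aggregation-id"
	// ProwJobJobNameAnnotation is the annotation key holding the Prow job name.
	ProwJobJobNameAnnotation = "prow.k8s.io/job"
)

// aggregationSelector is a hypothetical helper (not part of the PR) that builds
// the label set used to find every job belonging to one aggregation.
func aggregationSelector(aggregationID string) map[string]string {
	return map[string]string{AggregationIDLabel: aggregationID}
}
```

Keeping both keys in one package means the server, the reconciler, and the aggregator libraries all match on the same strings without redefining them.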

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify abortJob updates ProwJob state and persists via kubeClient correctly and idempotently.
  • Check isAggregatorJob identification logic (label + annotation) and label selector used to list aggregated jobs.
  • Confirm abortAll correctly aggregates counts/messages for aggregated and non-aggregated flows.
  • Review added test for realistic setup and sufficient assertions covering multiple aggregated-job states.

@openshift-ci openshift-ci bot requested review from deads2k and liangxia December 18, 2025 17:54
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 18, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
cmd/payload-testing-prow-plugin/server.go (1)

633-636: Minor: Consider singular/plural handling for user messages.

Similar to the test expectations, the success message uses "jobs" regardless of count. For better user experience, consider handling singular vs plural forms:

jobWord := "jobs"
if totalJobsAborted == 1 {
    jobWord = "job"
}
return fmt.Sprintf("aborted %d active payload %s for pull request %s/%s#%d", 
    totalJobsAborted, jobWord, org, repo, prNumber)

This is a minor UX improvement and doesn't affect functionality.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 105b667 and 3a2e0bd.

📒 Files selected for processing (8)
  • cmd/payload-testing-prow-plugin/server.go (3 hunks)
  • cmd/payload-testing-prow-plugin/server_test.go (2 hunks)
  • pkg/api/constant.go (1 hunks)
  • pkg/controller/prpqr_reconciler/prpqr_reconciler.go (4 hunks)
  • pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go (3 hunks)
  • pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go (2 hunks)
  • pkg/jobrunaggregator/jobrunaggregatorlib/util.go (0 hunks)
  • pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go (3 hunks)
💤 Files with no reviewable changes (1)
  • pkg/jobrunaggregator/jobrunaggregatorlib/util.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • pkg/api/constant.go
  • cmd/payload-testing-prow-plugin/server.go
  • cmd/payload-testing-prow-plugin/server_test.go
  • pkg/controller/prpqr_reconciler/prpqr_reconciler.go
  • pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go
  • pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go
  • pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go
🧬 Code graph analysis (5)
cmd/payload-testing-prow-plugin/server.go (2)
pkg/api/constant.go (2)
  • AggregationIDLabel (85-85)
  • ProwJobJobNameAnnotation (89-89)
pkg/release/config/config.go (1)
  • Job (53-60)
pkg/controller/prpqr_reconciler/prpqr_reconciler.go (2)
pkg/api/constant.go (1)
  • AggregationIDLabel (85-85)
pkg/api/pullrequestpayloadqualification/v1/types.go (1)
  • PullRequestPayloadQualificationRunLabel (13-13)
pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go (1)
pkg/api/constant.go (1)
  • ProwJobJobNameAnnotation (89-89)
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go (1)
pkg/api/constant.go (1)
  • ProwJobJobNameAnnotation (89-89)
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go (1)
pkg/api/constant.go (1)
  • ProwJobJobNameAnnotation (89-89)
🔇 Additional comments (9)
pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_release_jobs.go (1)

9-10: LGTM! Refactoring to use centralized constant.

The change correctly updates the code to use the public api.ProwJobJobNameAnnotation constant instead of a locally defined one, improving maintainability by consolidating the constant definition in the api package.

Also applies to: 26-26

pkg/controller/prpqr_reconciler/prpqr_reconciler.go (1)

407-407: LGTM! Consistent refactoring to centralized constant.

All usages of the aggregation ID label have been correctly updated to use api.AggregationIDLabel from the public api package, improving maintainability and consistency across the codebase.

Also applies to: 449-449, 783-783, 855-855

pkg/jobrunaggregator/jobrunaggregatorlib/jobrun_locator_per_pr_jobs.go (1)

9-10: LGTM! Refactoring with updated documentation.

The change correctly migrates to the centralized api.ProwJobJobNameAnnotation constant and updates the inline comment to reflect the new source, maintaining clear documentation.

Also applies to: 19-19, 28-28

pkg/jobrunaggregator/jobruntestcaseanalyzer/analyzer.go (1)

20-20: LGTM! Consistent migration to public constant.

Both annotation lookups have been correctly updated to use api.ProwJobJobNameAnnotation, moving away from the internal jobrunaggregatorlib constant to the centralized public api constant.

Also applies to: 99-99, 431-431

pkg/api/constant.go (1)

84-89: LGTM! Well-documented centralized constants.

The new AggregationIDLabel and ProwJobJobNameAnnotation constants are properly defined with clear documentation in the public api package, providing a single source of truth for these label/annotation keys across the codebase.

cmd/payload-testing-prow-plugin/server_test.go (1)

1056-1147: LGTM! Comprehensive test for aggregator abort functionality.

The new test case thoroughly validates the cascading abort behavior for aggregator jobs:

  • Correctly sets up an aggregator job with the api.AggregationIDLabel label
  • Creates multiple underlying aggregated jobs with various states
  • Verifies that only active jobs (aggregator + 2 active aggregated jobs) are counted, excluding the completed job
  • Uses the centralized constants for consistency

The test coverage aligns well with the new functionality in server.go.

cmd/payload-testing-prow-plugin/server.go (3)

528-541: LGTM! Clean helper method with proper error handling.

The abortJob method provides good separation of concerns by extracting the abort logic into a reusable helper. It correctly:

  • Checks if the job is already complete before attempting abort
  • Updates the job state to AbortedState
  • Returns both success status and error for proper handling by callers
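
The behaviour described in this list can be sketched as follows. This is a simplified illustration, not the PR's implementation: `prowJob`, `jobState`, `updater`, and `fakeClient` are hypothetical stand-ins for the Prow API types and the Kubernetes client, and the `Completed` field stands in for whatever completion check the real code uses.

```go
package main

import "errors"

// jobState and prowJob are hypothetical, reduced stand-ins for the Prow API types.
type jobState string

const (
	pendingState jobState = "pending"
	abortedState jobState = "aborted"
)

type prowJob struct {
	Name      string
	State     jobState
	Completed bool // stand-in for "the job has already finished"
}

// updater is a hypothetical interface standing in for the Kubernetes client.
type updater interface {
	Update(job *prowJob) error
}

// abortJob returns (true, nil) when the job was aborted, (false, nil) when it
// was already complete, and (false, err) when persisting the change failed.
func abortJob(client updater, job *prowJob) (bool, error) {
	if job.Completed {
		return false, nil
	}
	job.State = abortedState
	if err := client.Update(job); err != nil {
		return false, err
	}
	return true, nil
}

// fakeClient is an in-memory updater used only to exercise the sketch.
type fakeClient struct {
	fail    bool
	updated []string
}

func (f *fakeClient) Update(job *prowJob) error {
	if f.fail {
		return errors.New("update failed")
	}
	f.updated = append(f.updated, job.Name)
	return nil
}
```

Returning a boolean alongside the error lets a caller count only the jobs that were actually aborted while still collecting failures separately.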

577-629: LGTM! Robust abort logic for aggregator and regular jobs.

The enhanced abortAll logic correctly:

  • Detects aggregator jobs using the isAggregatorJob helper
  • Lists all aggregated jobs by the AggregationIDLabel
  • Aborts each job using the abortJob helper, including both aggregator and aggregated jobs
  • Tracks total jobs aborted across both aggregator and regular job flows
  • Properly handles errors and counts only successfully aborted jobs

The cascading abort functionality ensures that when an aggregator job is aborted, all its underlying aggregated jobs are also aborted, which aligns with the PR objectives.


543-561: No actionable changes needed. The isAggregatorJob method's use of the "aggregator-" prefix check is not fragile. This prefix is an established convention throughout the codebase—aggregator jobs are deliberately created with this prefix (see pkg/controller/prpqr_reconciler/prpqr_reconciler.go lines 242 and 489), and the pattern is validated by existing tests and used consistently across components including the UI server (which defines aggregatorPrefix = "aggregator-"). The two-level check combining label verification and job name prefix validation is reasonable defensive programming.

Likely an incorrect or invalid review comment.

@smg247
Member Author

smg247 commented Dec 18, 2025

/label tide/merge-method-squash

@openshift-ci openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Dec 18, 2025
Co-authored-by: Deep Mistry <dmistry@redhat.com>
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 18, 2025
@openshift-ci
Contributor

openshift-ci bot commented Dec 18, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deepsm007, smg247

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
cmd/payload-testing-prow-plugin/server.go (1)

607-611: Consider using a clearer error indicator for aggregated job listing failures.

The synthetic job name fmt.Sprintf("%s-aggregated-jobs", jobName) doesn't correspond to an actual ProwJob, which could be confusing when displayed in the error message. Consider using a more descriptive format that clearly indicates this was a listing failure rather than a job abort failure.

🔎 Suggested improvement
 			if err := s.kubeClient.List(s.ctx, &aggregatedJobs, &listOpts); err != nil {
 				jobLogger.WithError(err).Error("failed to list aggregated jobs")
-				erroredJobs = append(erroredJobs, fmt.Sprintf("%s-aggregated-jobs", jobName))
+				erroredJobs = append(erroredJobs, fmt.Sprintf("%s (failed to list aggregated jobs)", jobName))
 				continue
 			}
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 5f7053f and f5d6d5f.

📒 Files selected for processing (1)
  • cmd/payload-testing-prow-plugin/server.go (3 hunks)
🔇 Additional comments (3)
cmd/payload-testing-prow-plugin/server.go (3)

528-541: LGTM!

Clean extraction of the abort logic into a reusable helper. The return tuple (bool, error) clearly distinguishes between "already complete" and "failed to abort" scenarios.


543-561: LGTM!

The dual validation approach (checking both the aggregation ID label and the "aggregator-" prefix in the job name annotation) provides robust identification of aggregator jobs. Defensive nil checks are appropriate.
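
A rough sketch of the dual check described here, under stated assumptions: the `prowJob` struct is a hypothetical stand-in that carries only labels and annotations, and the constant names are local to this example. The label and annotation keys and the "aggregator-" prefix come from the discussion above; the exact control flow in server.go may differ.

```go
package main

import "strings"

const (
	aggregationIDLabel       = "release.openshift.io/aggregation-id"
	prowJobJobNameAnnotation = "prow.k8s.io/job"
	aggregatorPrefix         = "aggregator-"
)

// prowJob is a hypothetical stand-in holding only the metadata the check needs.
type prowJob struct {
	Labels      map[string]string
	Annotations map[string]string
}

// isAggregatorJob returns the aggregation ID and true only when the job carries
// the aggregation label AND its job-name annotation starts with "aggregator-".
func isAggregatorJob(job *prowJob) (string, bool) {
	if job == nil {
		return "", false
	}
	// Reading from a nil map is safe in Go and yields the zero value.
	id, ok := job.Labels[aggregationIDLabel]
	if !ok || id == "" {
		return "", false
	}
	if !strings.HasPrefix(job.Annotations[prowJobJobNameAnnotation], aggregatorPrefix) {
		return "", false
	}
	return id, true
}
```

The label alone is not enough to identify an aggregator, because the aggregated jobs carry the same aggregation ID; the name prefix is what tells the two apart in this sketch.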


596-634: Aggregator abort logic is well-structured.

The implementation correctly:

  • Detects aggregator jobs using both label and annotation
  • Lists all related jobs by aggregation ID
  • Skips the aggregator itself to avoid double processing
  • Tracks all aborted jobs in the counter

The use of Info level logging for "job was already complete" aligns with the previous review feedback.
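
The cascading flow listed above can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: `prowJob` is a reduced stand-in type, the `AggregationID` field replaces the label lookup, and filtering a slice replaces the label-selector List call against the cluster.

```go
package main

// prowJob is a hypothetical stand-in; AggregationID models the aggregation label value.
type prowJob struct {
	Name          string
	AggregationID string
	Completed     bool
	Aborted       bool
}

// abortOne marks an active job as aborted and reports whether it did so.
func abortOne(j *prowJob) bool {
	if j.Completed {
		return false
	}
	j.Aborted = true
	return true
}

// abortWithAggregated aborts the aggregator and every active job sharing its
// aggregation ID, returning how many jobs were aborted in total.
func abortWithAggregated(aggregator *prowJob, all []*prowJob) int {
	aborted := 0
	if abortOne(aggregator) {
		aborted++
	}
	for _, j := range all {
		// Stand-in for listing jobs by the aggregation ID label.
		if j.AggregationID != aggregator.AggregationID {
			continue
		}
		// Skip the aggregator itself so it is not processed twice.
		if j.Name == aggregator.Name {
			continue
		}
		if abortOne(j) {
			aborted++
		}
	}
	return aborted
}
```

With one aggregator, two active aggregated jobs, and one completed aggregated job, this sketch counts three aborted jobs, which matches the scenario the new test case is described as covering.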

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 105b667 and 2 for PR HEAD f5d6d5f in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 9dfdb9e and 1 for PR HEAD f5d6d5f in total

@smg247
Member Author

smg247 commented Dec 22, 2025

/retest

@smg247
Member Author

smg247 commented Dec 22, 2025

/pipeline required

@openshift-ci-robot
Contributor

Scheduling required tests:
/test e2e

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test integration-optional-test

@smg247
Member Author

smg247 commented Dec 22, 2025

/override ci/prow/breaking-changes

@openshift-ci
Contributor

openshift-ci bot commented Dec 22, 2025

@smg247: Overrode contexts on behalf of smg247: ci/prow/breaking-changes

@smg247
Member Author

smg247 commented Dec 22, 2025

/retest-required

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 4a066ea and 0 for PR HEAD f5d6d5f in total

@openshift-ci-robot
Contributor

/hold

Revision f5d6d5f was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 22, 2025
@smg247
Member Author

smg247 commented Dec 22, 2025

/test images

@smg247
Member Author

smg247 commented Dec 22, 2025

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 22, 2025
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 4a066ea and 2 for PR HEAD f5d6d5f in total

@openshift-ci
Contributor

openshift-ci bot commented Dec 22, 2025

@smg247: The following test failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | f5d6d5f | link | true | /test images |

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

3 participants