@kunalmemane

DO NOT MERGE - testing a possible bug

@kunalmemane
Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Nov 5, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kunalmemane
Once this PR has been reviewed and has the lgtm label, please assign smg247 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai bot commented Nov 5, 2025

Walkthrough

Added a top-level "DO NOT MERGE - investigating a possible bug" warning to README.md and increased a deliberate pre-create sleep in pkg/steps/source.go from 1 minute to 4 minutes inside the exponential-backoff retry loop used by handleBuild.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation notice: README.md | Inserted a blank line and the warning text "DO NOT MERGE - investigating a possible bug" as a top-level notice. |
| Source build retry change: pkg/steps/source.go | Increased the deliberate sleep before creating a build in handleBuild from 1 minute to 4 minutes within the exponential backoff retry loop; control flow and retry logic otherwise unchanged. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review the changed sleep duration in pkg/steps/source.go for unintended side effects on pipeline latency or timeouts.
  • Verify exponential-backoff behavior and interactions with any timeout/cancellation contexts.
  • Confirm the README warning is intended and remove or update when investigation completes.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

5-6: Ensure proper file formatting before merge.

The warning message is appropriately placed, and the /hold status correctly prevents merging. However, per the diff context, the file appears to be missing a newline at the end. Standard practice is to include a trailing newline at end-of-file.

Apply this diff to add a trailing newline:

-DO NOT MERGE - investigating a possible bug. 
+DO NOT MERGE - investigating a possible bug.

Additionally, once the underlying issue is resolved, remember to remove this temporary warning message and the preceding blank line before merging.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between e6cf944 and 3903ad3.

📒 Files selected for processing (1)
  • README.md (1 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3903ad3 and 411d5a6.

📒 Files selected for processing (1)
  • pkg/steps/source.go (1 hunks)

Comment on lines 555 to 558
 	if err := wait.ExponentialBackoff(wait.Backoff{Duration: time.Minute, Factor: 1.5, Steps: attempts}, func() (bool, error) {
 		var attempt buildapi.Build
 		// builds are using older src image, adding wait to avoid race condition
-		time.Sleep(1 * time.Minute)
+		time.Sleep(time.Minute * 4)
⚠️ Potential issue | 🟠 Major

Unconditional 4-minute sleep on every build attempt significantly impacts performance.

This sleep delays every build attempt—including the first one—by 4 minutes, potentially adding up to 20 minutes across all 5 retry attempts. Combined with the exponential backoff already configured (base 1 minute, factor 1.5), this creates substantial overhead for all builds, not just those experiencing race conditions.

Consider these alternatives:

  1. Verify source image readiness before build creation instead of sleeping:
 	if err := wait.ExponentialBackoff(wait.Backoff{Duration: time.Minute, Factor: 1.5, Steps: attempts}, func() (bool, error) {
 		var attempt buildapi.Build
-		// builds are using older src image, adding wait to avoid race condition
-		time.Sleep(time.Minute * 4)
+		// Verify source image is available before creating build
+		if err := verifySourceImageReady(ctx, client, build.Spec.CommonSpec.Strategy.DockerStrategy.From); err != nil {
+			return false, fmt.Errorf("source image not ready: %w", err)
+		}
 		build.DeepCopyInto(&attempt)
  2. Apply sleep only on retries (not the first attempt):
 	if err := wait.ExponentialBackoff(wait.Backoff{Duration: time.Minute, Factor: 1.5, Steps: attempts}, func() (bool, error) {
 		var attempt buildapi.Build
-		// builds are using older src image, adding wait to avoid race condition
-		time.Sleep(time.Minute * 4)
+		// On retry, wait for older src image to be ready to avoid race condition
+		if len(errs) > 0 {
+			time.Sleep(time.Minute * 4)
+		}
 		build.DeepCopyInto(&attempt)
  3. Watch the ImageStreamTag to confirm the source image is ready before proceeding.

Given the "DO NOT MERGE" label, if you're investigating whether this sleep resolves a specific race condition, consider gathering data on whether the issue occurs on first attempt or only on retries. This will help determine the root cause and proper fix.

Would you like me to generate a script to search for image readiness verification patterns used elsewhere in the codebase, or help implement a proper synchronization mechanism?

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/steps/source.go around lines 555-558, remove the unconditional 4-minute
time.Sleep which delays every build attempt; instead either (A) only delay on
retries by checking the attempt counter (i.e., skip the sleep when attempt==0
and apply a smaller backoff on subsequent attempts), or preferably (B) replace
the sleep with an active readiness check that watches the ImageStreamTag (or
queries the image stream) and returns immediately once the source image/tag is
present and ready; implement the chosen approach by removing the hard sleep,
adding a conditional (attempt>0) short sleep if you want minimal change, or by
invoking an ImageStreamTag readiness helper that polls/watches with a short
timeout before proceeding to build creation.

@openshift-ci
Contributor

openshift-ci bot commented Nov 10, 2025

@kunalmemane: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/breaking-changes | 1c94115 | link | false | /test breaking-changes |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.


1 participant