@openshift-cherrypick-robot

This is an automated cherry-pick of #10214

/assign tthvo

…achines

Control plane machines were intermittently created in availability
zones other than those specified in their machine specs. This occurred
because the zone list returned from FilterZonesBasedOnInstanceType was
built with a set's UnsortedList() method, which returns elements in
non-deterministic order.

When CAPI and MAPI manifest generation independently called this
function, they could receive the zones in different orders, causing a
mismatch in machine zone placement between the CAPI and MAPI manifests.

This commit sorts the zone slices before further processing.
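
To illustrate the idea behind the fix, here is a minimal standard-library sketch. It is not the installer's code: sortedZones and assignZones are hypothetical helpers, the set is modelled as a plain Go map (whose iteration order is randomized, much like UnsortedList()), and the round-robin placement is an assumption made only for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedZones returns the members of a zone set in ascending order.
// Ranging over a Go map yields keys in an unspecified order, so two
// callers could otherwise see the same zones in different orders.
func sortedZones(set map[string]struct{}) []string {
	zones := make([]string, 0, len(set))
	for z := range set {
		zones = append(zones, z)
	}
	sort.Strings(zones)
	return zones
}

// assignZones spreads replicas across zones round-robin.
// Hypothetical helper: the real placement logic may differ.
func assignZones(zones []string, replicas int) []string {
	if len(zones) == 0 {
		return nil
	}
	out := make([]string, replicas)
	for i := range out {
		out[i] = zones[i%len(zones)]
	}
	return out
}

func main() {
	set := map[string]struct{}{
		"us-east-1c": {},
		"us-east-1a": {},
		"us-east-1b": {},
	}
	// Two independent callers (standing in for CAPI and MAPI manifest
	// generation) now derive the same placement from the same set.
	capi := assignZones(sortedZones(set), 3)
	mapi := assignZones(sortedZones(set), 3)
	fmt.Println(capi) // [us-east-1a us-east-1b us-east-1c]
	fmt.Println(mapi) // [us-east-1a us-east-1b us-east-1c]
}
```

Because both callers sort before indexing, master-0, master-1, and master-2 map to the same zones regardless of how the set happens to be iterated.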
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: Jira Issue OCPBUGS-73773 has been cloned as Jira Issue OCPBUGS-73785. Will retitle bug to link to clone.
/retitle [release-4.20] OCPBUGS-73785: ensure deterministic zone ordering for control plane machines

Details

In response to this:

This is an automated cherry-pick of #10214

/assign tthvo

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot changed the title from "[release-4.20] OCPBUGS-73773: ensure deterministic zone ordering for control plane machines" to "[release-4.20] OCPBUGS-73785: ensure deterministic zone ordering for control plane machines" Jan 14, 2026
@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate (Referenced Jira bug's severity is moderate for the branch this PR is targeting.), jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.), and jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) labels Jan 14, 2026
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-73785, which is invalid:

  • expected dependent Jira Issue OCPBUGS-73773 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is MODIFIED instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Member

@tthvo tthvo left a comment

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Jan 14, 2026
@tthvo
Member

tthvo commented Jan 14, 2026

/cc @patrickdillon @liweinan

@openshift-ci
Contributor

openshift-ci bot commented Jan 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tthvo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jan 14, 2026
@tthvo
Member

tthvo commented Jan 14, 2026

/test e2e-aws-default-config

@gpei
Contributor

gpei commented Jan 14, 2026

/jira refresh

@openshift-ci-robot
Contributor

@gpei: This pull request references Jira Issue OCPBUGS-73785, which is invalid:

  • expected dependent Jira Issue OCPBUGS-73773 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is MODIFIED instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@openshift-ci
Contributor

openshift-ci bot commented Jan 14, 2026

@openshift-cherrypick-robot: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@patrickdillon
Contributor

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed label (Indicates a PR to a release branch has been evaluated and considered safe to accept.) Jan 14, 2026
@liweinan

I'll verify this today.

@liweinan

Verified with the build: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-aws-modern/2011655963467059200

Verification scripts: https://github.com/liweinan/my-openshift-workspace/tree/main/OCPBUGS-69923

anan@think:~/works/openshift-versions/420beta1$ ./openshift-install version
./openshift-install 4.20.0-0-2026-01-15-051152-test-ci-ln-hbwt5p2-latest
built from commit f4546b5cc38eae4d32d3314f2be0e78e4b4e4b00
release image registry.build10.ci.openshift.org/ci-ln-hbwt5p2/release@sha256:671a40f2ac49625dc85b0dfd9671ac11216b06c939bfed6a779bc66c714845a9
release architecture amd64
anan@think:~/works/openshift-versions/420beta1$ ./openshift-install create install-config
? SSH Public Key /home/anan/.ssh/id_rsa.pub
? Platform aws
INFO Credentials loaded from the AWS config using "SharedConfigCredentials: /home/anan/.aws/credentials" provider 
INFO Credentials loaded from the "default" profile in file "/home/anan/.aws/credentials" 
? Region us-east-1
? Base Domain qe.devcluster.openshift.com
? Cluster Name weli-test
? Pull Secret [? for help] **********************************************************************************************
INFO Install-Config created in: .
anan@think:~/works/openshift-versions/420beta1$ ls
install-config.yaml  openshift-install
anan@think:~/works/openshift-versions/420beta1$ cp install-config.yaml install-config.yaml.bkup
anan@think:~/works/openshift-versions/420beta1$ ls
install-config.yaml  install-config.yaml.bkup  openshift-install
anan@think:~/works/openshift-versions/420beta1$ ./openshift-install create manifests
INFO Credentials loaded from the "default" profile in file "/home/anan/.aws/credentials" 
INFO Credentials loaded from the AWS config using "SharedConfigCredentials: /home/anan/.aws/credentials" provider 
INFO Consuming Install Config from target directory 
INFO Successfully populated MCS CA cert information: root-ca 2036-01-13T08:02:41Z 2026-01-15T08:02:41Z 
INFO Successfully populated MCS TLS cert information: root-ca 2036-01-13T08:02:41Z 2026-01-15T08:02:41Z 
INFO Adding clusters...                           
INFO Manifests created in: cluster-api, manifests and openshift 
anan@think:~/works/openshift-versions/420beta1$ ../../my-openshift-workspace/OCPBUGS-69923/verify-manifests.sh 
==========================================
Verify CAPI and MAPI Manifest Zone Consistency
==========================================

Installation directory: .

CAPI Machine Zones:
  master-0 (99_openshift-cluster-api_master-machines-0.yaml): us-east-1a
  master-1 (99_openshift-cluster-api_master-machines-1.yaml): us-east-1b
  master-2 (99_openshift-cluster-api_master-machines-2.yaml): us-east-1c

MAPI Machine Zones:
  master-0 (from 99_openshift-machine-api_master-control-plane-machine-set.yaml): us-east-1a
  master-1 (from 99_openshift-machine-api_master-control-plane-machine-set.yaml): us-east-1b
  master-2 (from 99_openshift-machine-api_master-control-plane-machine-set.yaml): us-east-1c

==========================================
Consistency Check
==========================================
✓ Match: master-0 - Zone: us-east-1a
✓ Match: master-1 - Zone: us-east-1b
✓ Match: master-2 - Zone: us-east-1c

✅ Verification PASSED: All machines have consistent zone allocation!
anan@think:~/works/openshift-versions/420beta1$ ../../my-openshift-workspace/OCPBUGS-69923/verify-cluster.sh 
==========================================
Verify Machine Zone Consistency in Cluster
==========================================

Kubeconfig: /home/anan/works/openshift-versions/420beta1/auth/kubeconfig

✓ Successfully connected to cluster

Found 3 master machine(s)

==========================================
Check Zone Consistency for Each Machine
==========================================

--- Machine: weli-test-q7mkr-master-0 ---
  Zone Label:        us-east-1a
  ProviderID Zone:  us-east-1a
  Spec Zone:        us-east-1a
  Subnet Filter:     weli-test-q7mkr-subnet-private-us-east-1a
  ✅ Zone consistent

--- Machine: weli-test-q7mkr-master-1 ---
  Zone Label:        us-east-1b
  ProviderID Zone:  us-east-1b
  Spec Zone:        us-east-1b
  Subnet Filter:     weli-test-q7mkr-subnet-private-us-east-1b
  ✅ Zone consistent

--- Machine: weli-test-q7mkr-master-2 ---
  Zone Label:        us-east-1c
  ProviderID Zone:  us-east-1c
  Spec Zone:        us-east-1c
  Subnet Filter:     weli-test-q7mkr-subnet-private-us-east-1c
  ✅ Zone consistent

==========================================
Verification Summary
==========================================

Checked 3 master machine(s)

✅ Verification PASSED: All machines have consistent zones!

Cluster verification: PASS ✓

Fix verification successful:
  - Zone label, ProviderID zone, and Spec zone are all consistent
  - Machines are created in the correct availability zones

@liweinan

/verified by liweinan

@openshift-ci-robot openshift-ci-robot added the verified label (Signifies that the PR passed pre-merge verification criteria) Jan 15, 2026
@openshift-ci-robot
Contributor

@liweinan: This PR has been marked as verified by liweinan.

@tthvo
Member

tthvo commented Jan 16, 2026

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) Jan 16, 2026
@openshift-ci-robot
Contributor

@tthvo: This pull request references Jira Issue OCPBUGS-73785, which is valid.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.z) matches configured target version for branch (4.20.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-73773 is in the state Verified, which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-73773 targets the "4.21.0" version, which is one of the valid target versions: 4.21.0
  • bug has dependents

Requesting review from QA contact:
/cc @liweinan

@tthvo
Member

tthvo commented Jan 16, 2026

/tide refresh

@tthvo
Member

tthvo commented Jan 16, 2026

/cherry-pick release-4.19

@openshift-cherrypick-robot
Author

@tthvo: once the present PR merges, I will cherry-pick it on top of release-4.19 in a new PR and assign it to you.

@openshift-merge-bot openshift-merge-bot bot merged commit 289d016 into openshift:release-4.20 Jan 16, 2026
15 checks passed
@openshift-ci-robot
Contributor

@openshift-cherrypick-robot: Jira Issue Verification Checks: Jira Issue OCPBUGS-73785
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-73785 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

@openshift-cherrypick-robot
Author

@tthvo: new pull request created: #10230

@openshift-merge-robot
Contributor

Fix included in accepted release 4.20.0-0.nightly-2026-01-17-203204

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria
