@sergenyalcin
Member

@sergenyalcin sergenyalcin commented Jan 14, 2026

Description of your changes

This PR adds storage version migration support to upjet, enabling automatic migration of CRD resources from old storage versions to new storage versions when CRD schemas are updated.

When a CRD's storage version changes (e.g., from v1beta1 to v1beta2), existing resources stored in etcd remain in the old version until they are explicitly migrated. This can cause issues during upgrades, especially when introducing breaking changes.

The storage version migrator automates this migration process by:

  1. Listing all resources of a CRD
  2. Applying an empty patch to each resource (triggering the API server to convert and rewrite them in the new storage version)
  3. Updating the CRD's status.storedVersions to reflect only the new storage version
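
The three steps above can be sketched with an in-memory stand-in for the API server. This is a minimal illustration, not the code in pkg/config/crds_migrator.go: the `store` and `resource` types and the `list`, `emptyPatch`, and `migrate` functions are hypothetical names, and the "empty patch" is modeled as a write that leaves content untouched but re-encodes the object in the current storage version.

```go
package main

import "errors"

// resource is a hypothetical stand-in for one stored custom resource.
type resource struct {
	Name          string
	StoredVersion string // version the object is currently persisted in
}

// store is a hypothetical in-memory stand-in for the API server and etcd.
type store struct {
	StorageVersion string   // version new writes are persisted in
	StoredVersions []string // mirrors the CRD's status.storedVersions
	Items          []resource
}

// list returns the indices of up to limit items starting at start, plus the
// start of the next page; next is -1 when there are no more pages.
func (s *store) list(start, limit int) (idx []int, next int) {
	end := start + limit
	if end >= len(s.Items) {
		end = len(s.Items)
		next = -1
	} else {
		next = end
	}
	for i := start; i < end; i++ {
		idx = append(idx, i)
	}
	return idx, next
}

// emptyPatch models a no-op write: the content does not change, but the
// object is rewritten in the current storage version.
func (s *store) emptyPatch(i int) error {
	if i < 0 || i >= len(s.Items) {
		return errors.New("resource not found")
	}
	s.Items[i].StoredVersion = s.StorageVersion
	return nil
}

// migrate walks all items page by page, rewrites each one, and only then
// narrows StoredVersions to the single current storage version.
func migrate(s *store, batchSize int) error {
	if batchSize <= 0 {
		return errors.New("batch size must be positive")
	}
	for start := 0; start != -1; {
		idx, next := s.list(start, batchSize)
		for _, i := range idx {
			if err := s.emptyPatch(i); err != nil {
				return err
			}
		}
		start = next
	}
	s.StoredVersions = []string{s.StorageVersion}
	return nil
}
```

The ordering matters in this sketch: the stored-versions list is narrowed only after every page has been rewritten, so an error partway through leaves the old versions recorded.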

Core Implementation (pkg/config/crds_migrator.go)

  • CRDsMigrator: Handles storage version migration for a list of GroupVersionKinds
  • Run(): Executes the migration process with pagination support (500 resources per batch)
  • GetCRDNameFromGVK(): Resolves CRD names from GroupVersionKind using discovery client
  • PrepareCRDsMigrator(): Scans provider resources and creates a migrator for those with previous versions

Provider Integration (pkg/config/provider.go)

  • Added StorageVersionMigrator field to the Provider struct
  • Added WithStorageVersionMigrator() provider option

Example Usage

In provider configuration (config/provider.go):

func GetProvider(ctx context.Context, sdkProvider *schema.Provider) (*ujconfig.Provider, error) {
    pc := ujconfig.NewProvider(...)

    // Configure resources
    pc.ConfigureResources()

    // Prepare storage version migrator for resources with previous versions
    ujconfig.PrepareCRDsMigrator(pc)

    return pc, nil
}

In provider main (cmd/provider/main.go):

// After setting up controllers, run storage version migration
discoveryClient, err := discovery.NewDiscoveryClientForConfig(cfg)
if err != nil {
	logr.Info("Failed to create discovery client, skipping the storage version migration", "err", err)
} else {
	// Create a non-cached client for the migration since the manager cache hasn't started yet
	directClient, err := client.New(cfg, client.Options{Scheme: mgr.GetScheme()})
	if err != nil {
		logr.Info("Failed to create direct client for storage version migration", "err", err)
	} else {
		if err = clusterProvider.StorageVersionMigrator.Run(ctx, logr, discoveryClient, directClient); err != nil {
			logr.Info("Failed to run storage version migrator", "err", err)
		}
	}
}
kingpin.FatalIfError(mgr.Start(ctrl.SetupSignalHandler()), "Cannot start controller manager")

I have:

  • Read and followed Upjet's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Tested in this branch: https://github.com/sergenyalcin/provider-upjet-azuread/tree/sv-migrator. Validated the migration.

Summary by CodeRabbit

  • New Features

    • Added automatic CRD storage-version migration with batched conversion and verification to ensure custom resources are migrated cleanly.
  • Chores

    • Updated Kubernetes API dependency declarations to align with current requirements.


Signed-off-by: Sergen Yalçın <yalcinsergen97@gmail.com>
@coderabbitai

coderabbitai bot commented Jan 14, 2026

📝 Walkthrough

Adds a CRD storage-version migration facility (CRDsMigrator) with discovery- and client-driven migration logic, integrates it into Provider via a new option, and promotes k8s.io/apiextensions-apiserver to a direct go.mod requirement.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency Management: go.mod | Promoted k8s.io/apiextensions-apiserver v0.33.0 from indirect to a direct require in the primary require block (+1/-1). |
| CRD Storage-Version Migration: pkg/config/crds_migrator.go | Added CRDsMigrator type, NewCRDsMigrator, Run(ctx, logr, discoveryClient, kube), GetCRDNameFromGVK, and PrepareCRDsMigrator. Implements discovery-based CRD name resolution, detects storage version, batch-lists resources, applies empty patches to trigger conversion, and updates CRD.Status.StoredVersions. |
| Provider Integration: pkg/config/provider.go | Added StorageVersionMigrator *CRDsMigrator field to Provider and WithStorageVersionMigrator(migrator *CRDsMigrator) ProviderOption. |

Sequence Diagram

sequenceDiagram
    actor User
    participant Provider
    participant CRDsMigrator
    participant DiscoveryClient
    participant KubeClient
    participant CRDResource
    participant ResourceObj

    User->>Provider: PrepareCRDsMigrator(pc)
    Provider->>CRDsMigrator: NewCRDsMigrator(gvkList)
    Provider->>Provider: StorageVersionMigrator = migrator

    User->>CRDsMigrator: Run(ctx, logger, discoveryClient, kubeClient)
    loop For each GVK in gvkList
        CRDsMigrator->>DiscoveryClient: GetCRDNameFromGVK(gvk)
        DiscoveryClient-->>CRDsMigrator: crdName

        CRDsMigrator->>KubeClient: Get CRD by name
        KubeClient-->>CRDsMigrator: CRD object

        CRDsMigrator->>CRDsMigrator: Determine storage version

        alt Migration required
            CRDsMigrator->>KubeClient: List resources in storage version (batched)
            KubeClient-->>CRDsMigrator: Resource batch

            loop For each ResourceObj in batch
                CRDsMigrator->>ResourceObj: Apply empty patch (trigger conversion)
                ResourceObj-->>CRDsMigrator: Patched
            end

            CRDsMigrator->>CRDResource: Update CRD.Status.StoredVersions
            CRDResource-->>CRDsMigrator: Status updated
        end
    end
    CRDsMigrator-->>User: Migration complete

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 Hops for CRDs now shifting guise,

Storage versions find new ties,
Batches patched with patient care,
Discovery guides the migrator's fare,
🥕 A rabbit cheers the cluster's prize!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add storage version migration support' accurately summarizes the main change: introducing a new CRDsMigrator facility with supporting Provider integration to enable automated CRD storage version migration. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3258b35 and 738bbec.

📒 Files selected for processing (1)
  • pkg/config/crds_migrator.go
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Do not use any type throughout codebase - use concrete types or type parameters instead
Use pointer types for optional fields in generated structs
Avoid type aliases in favor of explicit types
Use github.com/pkg/errors for error wrapping with context
Return errors from functions instead of panicking, except for impossible states
Wrap errors with context using patterns like: errors.Wrap(err, "cannot configure resource")
Avoid circular dependencies between packages

Files:

  • pkg/config/crds_migrator.go
pkg/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Public API packages should be organized under pkg/ directory

Files:

  • pkg/config/crds_migrator.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: lint
  • GitHub Check: check-diff
  • GitHub Check: unit-tests
🔇 Additional comments (6)
pkg/config/crds_migrator.go (6)

29-41: LGTM!

Clean struct definition and constructor. The comment explaining the code origin from crossplane-runtime's internal package provides good context.


70-94: LGTM!

Good defensive check for empty storage version (line 78-80). The migration need detection logic correctly identifies when stored versions differ from the current storage version.
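
As a rough illustration of the detection described in this comment, the check can be written as a small pure function. This is a sketch under the assumption that migration is needed whenever status.storedVersions contains anything other than the current storage version; `needsMigration` is a hypothetical name, not a function from the PR.

```go
package main

// needsMigration reports whether any recorded stored version differs from
// the current storage version. An empty list means nothing was ever stored,
// so there is nothing to migrate.
func needsMigration(storedVersions []string, storageVersion string) bool {
	for _, v := range storedVersions {
		if v != storageVersion {
			return true
		}
	}
	return false
}
```

With this shape, a CRD whose stored versions are already exactly the storage version is skipped without listing any resources.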


105-129: LGTM!

The batch processing with pagination (500 resources per batch) and empty patch approach for triggering storage version migration is a well-established pattern. The comment explaining why the empty patch works is helpful.


131-148: Good defensive verification after status update.

The post-update verification (lines 139-146) ensures the CRD status was actually updated as expected. This is good defensive programming, especially for a migration operation where correctness is critical.


152-161: LGTM!

Good refactor to accept meta.RESTMapper instead of discovery.DiscoveryInterface, avoiding expensive repeated discovery calls per GVK.


163-178: LGTM!

The function correctly scans resources with previous versions and constructs the migrator. Note that this will always set StorageVersionMigrator even when no resources have previous versions (empty gvkList), which is fine since Run will no-op in that case.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
pkg/config/crds_migrator.go (3)

11-11: Inconsistent error package usage.

This file uses github.com/crossplane/crossplane-runtime/v2/pkg/errors while provider.go in the same package uses github.com/pkg/errors. Per coding guidelines, prefer github.com/pkg/errors for consistency.

♻️ Suggested change
-	"github.com/crossplane/crossplane-runtime/v2/pkg/errors"
+	"github.com/pkg/errors"

96-116: Consider adding progress logging for large migrations.

For CRDs with thousands of resources, the batch patching loop could run for a long time without feedback. Consider adding periodic progress logging (e.g., after each batch).

♻️ Optional enhancement
+		batchCount := 0
 		for {
 			if err := kube.List(ctx, &resources,
 				client.Limit(500),
 				client.Continue(continueToken),
 			); err != nil {
 				return errors.Wrapf(err, "cannot list %s", resources.GroupVersionKind().String())
 			}

 			for i := range resources.Items {
 				// apply empty patch for storage version upgrade
 				res := resources.Items[i]
 				if err := kube.Patch(ctx, &res, client.RawPatch(types.MergePatchType, []byte(`{}`))); err != nil {
 					return errors.Wrapf(err, "cannot patch %s %q", crd.Spec.Names.Kind, res.GetName())
 				}
 			}
+			batchCount++
+			logr.Debug("Processed batch", "crd", crdName, "batch", batchCount, "resourcesInBatch", len(resources.Items))

 			continueToken = resources.GetContinue()
 			if continueToken == "" {
 				break
 			}
 		}

141-154: Consider wrapping errors with context.

Per coding guidelines, wrap errors with context using patterns like errors.Wrap(err, "context"). The bare error returns on lines 144 and 150 lose context about what operation failed.

♻️ Suggested fix
 func GetCRDNameFromGVK(discoveryClient discovery.DiscoveryInterface, gvk schema.GroupVersionKind) (string, error) {
 	groupResources, err := restmapper.GetAPIGroupResources(discoveryClient)
 	if err != nil {
-		return "", err
+		return "", errors.Wrap(err, "cannot get API group resources")
 	}

 	mapper := restmapper.NewDiscoveryRESTMapper(groupResources)
 	mapping, err := mapper.RESTMapping(gvk.GroupKind(), gvk.Version)
 	if err != nil {
-		return "", err
+		return "", errors.Wrapf(err, "cannot get REST mapping for %s", gvk.String())
 	}

 	return mapping.Resource.Resource + "." + mapping.Resource.Group, nil
 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 914f0cb and 3258b35.

📒 Files selected for processing (3)
  • go.mod
  • pkg/config/crds_migrator.go
  • pkg/config/provider.go
🧰 Additional context used
📓 Path-based instructions (3)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Do not use any type throughout codebase - use concrete types or type parameters instead
Use pointer types for optional fields in generated structs
Avoid type aliases in favor of explicit types
Use github.com/pkg/errors for error wrapping with context
Return errors from functions instead of panicking, except for impossible states
Wrap errors with context using patterns like: errors.Wrap(err, "cannot configure resource")
Avoid circular dependencies between packages

Files:

  • pkg/config/crds_migrator.go
  • pkg/config/provider.go
pkg/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Public API packages should be organized under pkg/ directory

Files:

  • pkg/config/crds_migrator.go
  • pkg/config/provider.go
go.mod

📄 CodeRabbit inference engine (CLAUDE.md)

go.mod: Module path must be github.com/crossplane/upjet/v2
When testing in providers, add replace github.com/crossplane/upjet/v2 => ../upjet to provider's go.mod
Run make modules.check to verify go.mod/go.sum are tidy before committing

Files:

  • go.mod
🧬 Code graph analysis (2)
pkg/config/crds_migrator.go (1)
pkg/config/provider.go (1)
  • Provider (73-186)
pkg/config/provider.go (1)
pkg/config/crds_migrator.go (1)
  • CRDsMigrator (28-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: lint
  • GitHub Check: unit-tests
  • GitHub Check: check-diff
🔇 Additional comments (7)
go.mod (1)

39-39: LGTM!

The promotion of k8s.io/apiextensions-apiserver to a direct dependency is appropriate since the new CRDsMigrator directly uses extv1.CustomResourceDefinition. The version aligns correctly with other k8s.io dependencies at v0.33.0.

pkg/config/provider.go (2)

182-186: LGTM!

The StorageVersionMigrator field follows the established pattern for optional Provider fields with pointer type and clear documentation.


315-321: LGTM!

The WithStorageVersionMigrator option follows the consistent pattern of other ProviderOption functions in this file.

pkg/config/crds_migrator.go (4)

27-39: LGTM!

The struct and constructor are clean and follow Go conventions with an unexported field and exported constructor.


118-134: LGTM!

The status update with verification is good defensive programming. Using MergeFrom for the status patch is the correct approach.


156-171: LGTM!

The function correctly identifies resources requiring migration based on PreviousVersions and constructs GVKs following upjet conventions.


1-1: License header year appears incorrect.

The copyright year is 2026, but current date is January 2026 and this is new code. Typically, copyright years reflect when the code was written. If this was written in 2025 or earlier during development, consider using the appropriate year.

⛔ Skipped due to learnings
Learnt from: CR
Repo: crossplane/upjet PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-13T21:29:32.267Z
Learning: Use Kubernetes-native patterns via crossplane-runtime
Learnt from: CR
Repo: crossplane/upjet PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-13T21:29:32.266Z
Learning: Applies to go.mod : Module path must be `github.com/crossplane/upjet/v2`


func PrepareCRDsMigrator(pc *Provider) {
	var gvkList []schema.GroupVersionKind
	for _, r := range pc.Resources {
		if len(r.PreviousVersions) != 0 {
Collaborator


Note: this is a heads-up rather than an immediate issue, and it is not specific to the changes in this PR.

I checked how r.PreviousVersions is used in general. We rely solely on the provider (and therefore the developer) to configure it correctly. There is no validation or cross-check of this configuration, especially against r.Version. For example:

  • It is possible to forget to configure it: r.Version = "v1beta2" and r.PreviousVersions = [] (this may still be a valid scenario once v1beta1 has actually been removed).
  • There is no check that Version and PreviousVersions are disjoint: r.Version = "v1beta2" and r.PreviousVersions = ["v1beta1", "v1beta2"] is possible.
  • There is no check that PreviousVersions are behind the current Version: r.Version = "v1beta2" and r.PreviousVersions = ["v1beta3", "v1beta4"] is possible.

I am noting these because a new feature is now being built around this field.
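
For illustration only, the missing cross-checks listed above could look roughly like the following. Nothing here exists in upjet: `validateVersions`, `rank`, and `less` are hypothetical names, and the ordering rule (major number first, then alpha before beta before GA, then the trailing number) is a simplifying assumption rather than Kubernetes' own version-priority rule.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// versionRE matches Kubernetes-style versions such as v1, v1alpha2, v1beta1.
var versionRE = regexp.MustCompile(`^v([0-9]+)(?:(alpha|beta)([0-9]+))?$`)

// rank turns a version into a comparable triple {major, stage, minor},
// where stage is 0 for alpha, 1 for beta, and 2 for GA (assumed ordering).
func rank(v string) ([3]int, error) {
	m := versionRE.FindStringSubmatch(v)
	if m == nil {
		return [3]int{}, fmt.Errorf("unsupported version format %q", v)
	}
	major, err := strconv.Atoi(m[1])
	if err != nil {
		return [3]int{}, fmt.Errorf("invalid major in %q: %w", v, err)
	}
	stage, minor := 2, 0
	if m[2] != "" {
		if m[2] == "alpha" {
			stage = 0
		} else {
			stage = 1
		}
		if minor, err = strconv.Atoi(m[3]); err != nil {
			return [3]int{}, fmt.Errorf("invalid minor in %q: %w", v, err)
		}
	}
	return [3]int{major, stage, minor}, nil
}

// less reports whether a sorts strictly before b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// validateVersions checks that previous versions are unique, do not include
// the current version, and are all behind the current version.
func validateVersions(current string, previous []string) error {
	cur, err := rank(current)
	if err != nil {
		return err
	}
	seen := map[string]bool{}
	for _, p := range previous {
		if p == current {
			return fmt.Errorf("previous version %q equals current version", p)
		}
		if seen[p] {
			return fmt.Errorf("duplicate previous version %q", p)
		}
		seen[p] = true
		pr, err := rank(p)
		if err != nil {
			return err
		}
		if !less(pr, cur) {
			return fmt.Errorf("previous version %q is not behind current version %q", p, current)
		}
	}
	return nil
}
```

An empty PreviousVersions list passes deliberately, since the first bullet notes that it can be a valid configuration; a check like this cannot tell a forgotten entry from a removed version.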

Member Author


Yes, I considered this as well. I think maintaining a complete version list as a separate new feature may make sense, and it should be recorded automatically rather than configured manually.

origCrd := crd.DeepCopy()

crd.Status.StoredVersions = []string{storageVersion}
if err := kube.Status().Patch(ctx, &crd, client.MergeFrom(origCrd)); err != nil {
Collaborator


Unfortunately, the provider might need an extra RBAC rule here to patch CRDs, which is not ideal.

Member Author


Yes, that's right. I manually added these rules to the provider ClusterRole and tested with them. If we skip this step, I believe it will be updated at some point anyway; however, we would not be able to validate the migration.

From the RBAC manager's perspective, granting this permission to all providers may not be sensible, because many providers will not want to perform this operation. One option is to proceed without this part, i.e., without patching the CRDs and validating the result. That sounds better than granting this permission to all providers.

Signed-off-by: Sergen Yalçın <yalcinsergen97@gmail.com>