Skip to content

copy: add InstancePlatforms field for platform-based filtering#656

Open
aguidirh wants to merge 2 commits intocontainers:mainfrom
aguidirh:sparse-manifest-platforms-pr1938
Open

copy: add InstancePlatforms field for platform-based filtering#656
aguidirh wants to merge 2 commits intocontainers:mainfrom
aguidirh:sparse-manifest-platforms-pr1938

Conversation

@aguidirh
Copy link

Add platform-based filtering for copying specific images

This PR adds the ability to select images by platform name (e.g., linux/amd64) instead of requiring digest hashes, implementing the functionality proposed in containers/image#1938.

Motivation

Relates to #227

When copying specific images from a multi-architecture manifest list, the current CopySpecificImages mode requires users to specify exact digest hashes in the Instances field. This is cumbersome because:

  1. Poor User Experience: Users must manually look up digest values for each platform
  2. Error-Prone: Easy to copy wrong digest or get confused about which digest corresponds to which platform
  3. Not Intuitive: Most users think in terms of "amd64" or "arm64", not "sha256:abc123..."

This PR adds a new InstancePlatforms field that allows users to specify platforms by human-readable names.

Changes

1. New InstancePlatforms field in image/copy

  • Added InstancePlatforms []imgspecv1.Platform field to Options struct
  • Allows specifying platforms like {OS: "linux", Architecture: "amd64"}
  • Supports full platform specification including OS, Architecture, and Variant
  • Works alongside existing Instances field (both can be used simultaneously)

2. Platform resolution in determineSpecificImages() (image/copy/multiple.go)

  • New function that combines digest-based and platform-based selection
  • Uses ChooseInstanceByCompression() to find best match for each platform
  • Returns deduplicated Set of digests from both sources
  • Clear error messages when requested platform doesn't exist

3. Efficient Set-based filtering

  • Refactored from slices.Contains() to Set-based lookup
  • Improves performance from O(n) to O(1) per lookup
  • Added Size() method to internal/set/set.go

4. Comprehensive test coverage

  • Added TestDetermineSpecificImages with table-driven tests
  • Tests platform-only, digest-only, and combined selection
  • Validates error handling for non-existent platforms
  • All existing tests continue to pass

User Experience Improvement

Before (digest-based only):

options := &copy.Options{
    ImageListSelection: copy.CopySpecificImages,
    Instances: []digest.Digest{
        "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f",  // Which platform?
        "sha256:5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270",  // Which platform?
    },
}

After (platform-based):

options := &copy.Options{
    ImageListSelection: copy.CopySpecificImages,
    InstancePlatforms: []imgspecv1.Platform{
        {OS: "linux", Architecture: "amd64"},
        {OS: "linux", Architecture: "ppc64le"},
    },
}

Combined (both methods):

options := &copy.Options{
    ImageListSelection: copy.CopySpecificImages,
    Instances:         []digest.Digest{specificDigest},      // When you know the exact digest
    InstancePlatforms: []imgspecv1.Platform{                 // When you know the platform
        {OS: "linux", Architecture: "arm64"},
    },
}

Testing

All existing tests pass. New tests verify:

  • ✅ Platform-only selection
  • ✅ Digest-only selection (backward compatibility)
  • ✅ Combined platform and digest selection
  • ✅ Error handling for non-existent platforms
  • ✅ Set deduplication when same instance selected by both methods

Compatibility

  • Fully backward compatible - new optional field
  • ✅ Existing code using Instances field continues to work unchanged
  • ✅ No breaking changes to public API

Credits

This implementation is based on the original work by @nalind in containers/image#1938, adapted for the container-libs monorepo structure with the following changes:

  • Implemented all review feedback from @mtrmac:
    • All Set methods use pointer receivers
    • Complete platform specification (OS + Architecture + Variant)
    • Clear error messages for missing platforms
  • Added table-driven tests for better maintainability
  • Improved code comments and documentation

Add the ability to select images by platform name instead of requiring
digest hashes. This implements the functionality originally proposed in
containers/image#1938.

When copying specific images from a multi-architecture manifest list,
users currently must specify exact digest hashes. This is cumbersome and
error-prone, as users must manually look up digests and can easily
confuse which digest corresponds to which platform.

This commit adds a new InstancePlatforms field that allows users to
specify platforms by human-readable names like "linux/amd64" or
"linux/arm64". The field works alongside the existing Instances field,
allowing users to combine both methods.

The implementation includes:
1. New InstancePlatforms []imgspecv1.Platform field in Options
2. determineSpecificImages() function to resolve platforms to digests
   and combine with digest-based selection
3. Efficient Set-based filtering (O(1) lookup vs O(n) with slices)
4. Size() method added to internal/set for Set length queries
5. Table-driven tests covering all selection scenarios

Based on original work by @nalind in containers/image#1938, adapted for
the container-libs monorepo structure.

Relates to containers#227

Signed-off-by: Alex Guidi <aguidi@redhat.com>
@github-actions github-actions bot added the image Related to "image" package label Feb 11, 2026
@packit-as-a-service
Copy link

Packit jobs failed. @containers/packit-build please check.

1 similar comment
@packit-as-a-service
Copy link

Packit jobs failed. @containers/packit-build please check.

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

An extremely brief look for now.

ArchitectureChoice: platform.Architecture,
VariantChoice: platform.Variant,
}
instanceDigest, err := updatedList.ChooseInstanceByCompression(&platformContext, options.PreferGzipInstances)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum… we have multi-compression images (where there is a gzip instance and a zstd instance for the same platform). This would only copy one of the instances.

OTOH a trivial “does the instance match the required Platform value” check might copy too much, because a v1 variant requirement would match a v1,v2,v3 instances.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback! You're right that using ChooseInstanceByCompression() only copies one instance per platform, potentially missing multi-compression variants.

Proposed Solution

Default behavior: Copy ALL compression variants for platforms specified in InstancePlatforms. This preserves all available data for selected platforms, maintains the original digest for each instance, and avoids implicit filtering.

For compression-specific selection: Introduce a wrapper struct that allows optional compression filtering:

type PlatformSelection struct {
    Platform     imgspecv1.Platform
    Compressions []compression.Algorithm  // nil or empty = copy all compressions
}

type Options struct {
    // ... existing fields ...
    InstancePlatforms []PlatformSelection
}

Usage Examples

Copy all compressions (default, most common):

InstancePlatforms: []PlatformSelection{
    {Platform: {OS: "linux", Architecture: "amd64"}},  // Compressions nil = all
}

Filter to specific compressions:

InstancePlatforms: []PlatformSelection{
    {
        Platform: {OS: "linux", Architecture: "amd64"},
        Compressions: []compression.Algorithm{compression.Gzip},
    },
}

Different compressions per platform:

InstancePlatforms: []PlatformSelection{
    {
        Platform: {OS: "linux", Architecture: "amd64"},
        Compressions: []compression.Algorithm{compression.Gzip},
    },
    {
        Platform: {OS: "linux", Architecture: "arm64"},
        Compressions: nil,  // all compressions
    },
}

Rationale

  1. Better UX: Platform-based selection should be inclusive by default. Users shouldn't lose compression variants unless explicitly requested.

  2. Avoids digest lookup: The whole point of InstancePlatforms is convenience. While users could use the Instances field with exact digests for compression-specific control, this would require manually looking up digests for each compression variant—a poor user experience that defeats the purpose of platform-based selection.

  3. Clear semantics:

    • InstancePlatforms = broad selection (by platform)
    • Instances = precise selection (by digest)
    • Compressions field = optional filter
  4. Future extensibility: Using a wrapper struct that we control (rather than directly exposing imgspecv1.Platform) makes it easier to extend in the future. We can add new fields for additional filtering or options without being constrained by the external OCI spec types.

  5. Not a breaking change: Since InstancePlatforms is new in this PR, we can get the design right before it becomes public API.

Does this approach address your concerns? I'm happy to implement it if you agree with the direction.

}

require.NoError(t, err)
assert.Equal(t, tt.expectedSize, specificImages.Size())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably use assert.ElementsMatch, for a better error output on failure and perhaps smaller test code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback, you're right, the test became smaller.

I just pushed a commit with your suggestion.

@TomSweeneyRedHat
Copy link
Member

I've approved this and restarted a couple of failng tests. I'd be interested in seeing your reply to @mtrmac 's comment on the multi-compression images.

Overall, a nice change, thanks!

Address code review feedback from @mtrmac:
- Use assert.ElementsMatch() instead of manual assertion loop
- Remove unnecessary expectedSize field from test struct

Signed-off-by: Alex Guidi <aguidi@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

image Related to "image" package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants