Conversation

@kalbasit
Owner

@kalbasit kalbasit commented Jan 31, 2026

This commit adds a new S3-compatible storage backend for content-defined chunks:

  • Implemented s3Store using the MinIO Go SDK.
  • Added support for Get, Put, HasChunk, and Delete operations on S3.
  • Added integration tests for S3 storage (runnable against MinIO).
  • Handles both path-style and virtual-hosted-style bucket addressing transparently.
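
For readers who have not opened the diff, here is a rough sketch of the contract that the four operations above suggest. The interface name `ChunkStore`, the method names and signatures, and the in-memory `memStore` are all assumptions made for illustration; the real s3Store talks to S3 through the MinIO Go SDK and may look different.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"sync"
)

// ChunkStore is an assumed shape for the four operations listed above;
// the method names and signatures in the PR may differ.
type ChunkStore interface {
	HasChunk(ctx context.Context, hash string) (bool, error)
	GetChunk(ctx context.Context, hash string) (io.ReadCloser, error)
	PutChunk(ctx context.Context, hash string, data []byte) (created bool, err error)
	DeleteChunk(ctx context.Context, hash string) error
}

// memStore is a hypothetical in-memory stand-in for a bucket, keyed by chunk hash.
type memStore struct {
	mu      sync.Mutex
	objects map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{objects: make(map[string][]byte)}
}

func (s *memStore) HasChunk(_ context.Context, hash string) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.objects[hash]
	return ok, nil
}

func (s *memStore) GetChunk(_ context.Context, hash string) (io.ReadCloser, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	data, ok := s.objects[hash]
	if !ok {
		return nil, fmt.Errorf("chunk %q not found", hash)
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

// PutChunk stores data under hash and reports whether a new object was created.
func (s *memStore) PutChunk(_ context.Context, hash string, data []byte) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.objects[hash]; ok {
		return false, nil // content-addressed: an existing chunk is left untouched
	}
	s.objects[hash] = append([]byte(nil), data...)
	return true, nil
}

func (s *memStore) DeleteChunk(_ context.Context, hash string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.objects, hash)
	return nil
}

// Compile-time check that memStore satisfies the assumed interface.
var _ ChunkStore = (*memStore)(nil)

// roundTrip writes a chunk through the interface and reads it back.
func roundTrip(store ChunkStore, hash string, data []byte) (string, error) {
	ctx := context.Background()
	if _, err := store.PutChunk(ctx, hash, data); err != nil {
		return "", err
	}
	rc, err := store.GetChunk(ctx, hash)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	body, err := io.ReadAll(rc)
	return string(body), err
}

func main() {
	body, err := roundTrip(newMemStore(), "abc123", []byte("hello"))
	fmt.Println(body, err)
}
```

Code written against an interface like this can be exercised with the in-memory store in unit tests and with the S3-backed store in integration tests.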

Part of #322

@dosubot dosubot bot added size:L (This PR changes 100-499 lines, ignoring generated files.), enhancement (New feature or request), go (Pull requests that update go code) labels Jan 31, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @kalbasit, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces an S3-compatible storage backend for content-defined chunks, so the system can store and retrieve chunks from S3-compatible object storage. The implementation covers the Get, Put, HasChunk, and Delete operations on chunks and comes with integration tests.

Highlights

  • New S3 Storage Backend: Introduced a new S3-compatible storage backend for content-defined chunks, enabling scalable and robust storage solutions.
  • MinIO Go SDK Integration: The S3 storage implementation leverages the MinIO Go SDK for interacting with S3-compatible services.
  • Core Chunk Operations: Full support for essential chunk operations including Get, Put, HasChunk, and Delete has been implemented for the S3 backend.
  • Flexible Bucket Addressing: The S3 store transparently handles both path-style and virtual-hosted style buckets, providing flexibility in S3 endpoint configurations.
  • Integration Tests: Comprehensive integration tests have been added for the S3 storage, designed to be runnable with a MinIO instance to ensure functionality and reliability.
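
To make the two addressing styles named in the highlights concrete, the sketch below builds an object URL both ways using only the standard library. The function `objectURL`, the bucket name, and the key layout are hypothetical; the PR relies on the MinIO Go SDK for this and its actual logic is not shown in this conversation.

```go
package main

import (
	"fmt"
	"net/url"
)

// objectURL builds the URL of an object under either addressing style.
// Path style puts the bucket in the path; virtual-hosted style puts it in the host.
func objectURL(endpoint, bucket, key string, pathStyle bool) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("parse endpoint: %w", err)
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("endpoint %q must include a scheme and a host", endpoint)
	}
	if pathStyle {
		u.Path = "/" + bucket + "/" + key
	} else {
		u.Host = bucket + "." + u.Host
		u.Path = "/" + key
	}
	return u.String(), nil
}

func main() {
	// Hypothetical bucket and key, for illustration only.
	for _, pathStyle := range []bool{true, false} {
		s, err := objectURL("https://s3.example.com", "chunks", "ab/abcdef", pathStyle)
		fmt.Println(s, err)
	}
}
```

With these inputs the path-style URL is `https://s3.example.com/chunks/ab/abcdef` and the virtual-hosted-style URL is `https://chunks.s3.example.com/ab/abcdef`. Local MinIO deployments typically need path style, because the bucket subdomain would not resolve.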

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces S3 storage support for chunks, which is a great feature. The implementation is clean and includes integration tests. I've identified two areas for improvement: one is making the S3 endpoint URL parsing more robust, and the other is a race condition in the PutChunk method, which is acceptable for now given the current application context and the fact that it is documented as technical debt. My comments provide specific suggestions to address these points.
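
The inline suggestion about endpoint parsing is not reproduced in this conversation, so the following is only a sketch of what stricter parsing could look like. As far as I know, `minio.New` expects a bare `host[:port]` plus a separate `Secure` flag, so the sketch splits a configured URL into those two values. The function name `parseEndpoint` and the choice to default to TLS when no scheme is given are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// parseEndpoint splits a configured endpoint into a host[:port] and a TLS flag.
// It rejects unknown schemes and endpoints that carry a path.
func parseEndpoint(raw string) (host string, secure bool, err error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return "", false, errors.New("endpoint is empty")
	}
	if !strings.Contains(raw, "://") {
		raw = "https://" + raw // assumption: default to TLS when no scheme is given
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", false, fmt.Errorf("parse endpoint: %w", err)
	}
	switch u.Scheme {
	case "http":
		secure = false
	case "https":
		secure = true
	default:
		return "", false, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return "", false, errors.New("endpoint has no host")
	}
	if u.Path != "" && u.Path != "/" {
		return "", false, fmt.Errorf("endpoint must not contain a path, got %q", u.Path)
	}
	return u.Host, secure, nil
}

func main() {
	for _, raw := range []string{"http://localhost:9000", "https://s3.example.com/", "ftp://example.com"} {
		host, secure, err := parseEndpoint(raw)
		fmt.Println(host, secure, err)
	}
}
```

Rejecting a path up front avoids silently dropping something like a bucket name that a user pasted into the endpoint setting.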

@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch 2 times, most recently from 1643466 to 8bb122a January 31, 2026 03:02
@kalbasit kalbasit force-pushed the 01-30-feat_add_core_content-defined_chunking_cdc_infrastructure branch from 844b061 to 8b0a778 January 31, 2026 06:03
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch from 8bb122a to 29bc630 January 31, 2026 06:03
@kalbasit kalbasit changed the base branch from 01-30-feat_add_core_content-defined_chunking_cdc_infrastructure to graphite-base/668 January 31, 2026 06:25
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch from 29bc630 to b92988e January 31, 2026 06:33
@kalbasit kalbasit changed the base branch from graphite-base/668 to 01-30-feat_add_core_content-defined_chunking_cdc_infrastructure January 31, 2026 06:33
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch from b92988e to e7a90b5 January 31, 2026 06:46
@kalbasit kalbasit force-pushed the 01-30-feat_add_core_content-defined_chunking_cdc_infrastructure branch from 7a370b5 to a033c74 January 31, 2026 20:23
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch 4 times, most recently from 2a45a7a to 736ed2e January 31, 2026 23:39
Base automatically changed from 01-30-feat_add_core_content-defined_chunking_cdc_infrastructure to main January 31, 2026 23:49
@dosubot dosubot bot added size:XXL (This PR changes 1000+ lines, ignoring generated files.) and removed size:L (This PR changes 100-499 lines, ignoring generated files.) labels Jan 31, 2026
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch from 736ed2e to 87d7b35 January 31, 2026 23:52
@dosubot dosubot bot added size:L (This PR changes 100-499 lines, ignoring generated files.) and removed size:XXL (This PR changes 1000+ lines, ignoring generated files.) labels Jan 31, 2026
@kalbasit
Owner Author

/gemini review

@dosubot dosubot bot added size:L (This PR changes 100-499 lines, ignoring generated files.) and removed size:XXL (This PR changes 1000+ lines, ignoring generated files.) labels Feb 1, 2026
@kalbasit kalbasit changed the base branch from graphite-base/668 to refactor/s3-test-helpers February 1, 2026 02:32
@kalbasit kalbasit force-pushed the refactor/s3-test-helpers branch from 1f9be6a to ef1a6b7 February 1, 2026 02:51
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch 2 times, most recently from b028cf2 to 29af693 February 1, 2026 02:56
@kalbasit kalbasit force-pushed the refactor/s3-test-helpers branch from ef1a6b7 to 094773e February 1, 2026 02:56
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch 2 times, most recently from 27650ba to 3066fbd February 1, 2026 03:08
@kalbasit kalbasit force-pushed the refactor/s3-test-helpers branch from da6a58a to 721eab8 February 1, 2026 03:08
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch 2 times, most recently from e7890b6 to 64c004f February 1, 2026 03:14
Base automatically changed from refactor/s3-test-helpers to main February 1, 2026 03:25
@dosubot dosubot bot added size:XL (This PR changes 500-999 lines, ignoring generated files.) and removed size:L (This PR changes 100-499 lines, ignoring generated files.) labels Feb 1, 2026
@kalbasit kalbasit force-pushed the 01-30-feat_add_s3_storage_support_for_chunks branch from 64c004f to 7983dd4 February 1, 2026 03:55
@dosubot dosubot bot added size:L (This PR changes 100-499 lines, ignoring generated files.) and removed size:XL (This PR changes 500-999 lines, ignoring generated files.) labels Feb 1, 2026
@kalbasit
Owner Author

kalbasit commented Feb 1, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces S3-compatible storage for content-defined chunks, which is a great addition. The implementation in pkg/storage/chunk/s3.go is well-structured, using the MinIO Go SDK and incorporating locking to prevent race conditions during chunk uploads. The integration tests are comprehensive and cover the main use cases, including race conditions.
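
To illustrate why a lock around the upload matters, the sketch below serializes the existence check and the upload per chunk hash, so concurrent writers of the same chunk produce a single upload. Everything here is a stand-in: `keyedMutex` is an in-process lock, whereas the locker in pkg/storage/chunk/s3.go is not shown in this conversation and may well be distributed, and the `store` type simply counts uploads in memory.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedMutex hands out one mutex per key. Entries are never removed,
// which is acceptable for a sketch but not for a long-running service.
type keyedMutex struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// lock blocks until the mutex for key is held and returns its unlock function.
func (k *keyedMutex) lock(key string) (unlock func()) {
	k.mu.Lock()
	if k.locks == nil {
		k.locks = make(map[string]*sync.Mutex)
	}
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	k.mu.Unlock()
	m.Lock()
	return m.Unlock
}

// store is a hypothetical in-memory bucket that counts uploads.
type store struct {
	locks   keyedMutex
	mu      sync.Mutex // guards objects and uploads
	objects map[string][]byte
	uploads int
}

func (s *store) has(hash string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.objects[hash]
	return ok
}

func (s *store) upload(hash string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objects[hash] = data
	s.uploads++
}

// putChunk performs the check and the upload under the per-hash lock,
// so two writers cannot both observe "missing" and both upload.
func (s *store) putChunk(hash string, data []byte) (created bool) {
	unlock := s.locks.lock(hash)
	defer unlock()
	if s.has(hash) {
		return false
	}
	s.upload(hash, data)
	return true
}

// racePut starts n concurrent writers of the same chunk and reports how many
// saw created == true and how many uploads actually happened.
func racePut(n int) (created, uploads int) {
	s := &store{objects: make(map[string][]byte)}
	var wg sync.WaitGroup
	var mu sync.Mutex
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.putChunk("deadbeef", []byte("payload")) {
				mu.Lock()
				created++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return created, s.uploads
}

func main() {
	created, uploads := racePut(16)
	fmt.Println(created, uploads)
}
```

Without the per-hash lock, several goroutines could pass the `has` check before any upload lands, which is the check-then-put race the first review flagged.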

I have a couple of suggestions to improve maintainability and robustness in the S3 store implementation. One is to avoid a hardcoded TTL for locks, and the other is to make the error handling for non-existent chunks more consistent. Please see the detailed comments.
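
Both suggestions can be sketched without the S3 client. The example below makes the lock TTL a constructor option with a default, and maps a backend "no such key" error to one sentinel so callers can test for a missing chunk in a single way. All names (`WithLockTTL`, `ErrNotFound`, `errNoSuchKey`, the 30-second default) are assumptions for illustration and are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// defaultLockTTL is an assumed default; the value used in the PR is not shown.
const defaultLockTTL = 30 * time.Second

// ErrNotFound is a hypothetical sentinel returned whenever a chunk does not exist.
var ErrNotFound = errors.New("chunk not found")

// errNoSuchKey stands in for the backend's "object does not exist" error.
var errNoSuchKey = errors.New("NoSuchKey")

type config struct {
	lockTTL time.Duration
}

// Option configures the store at construction time.
type Option func(*config)

// WithLockTTL overrides the default lock TTL; non-positive values are ignored.
func WithLockTTL(d time.Duration) Option {
	return func(c *config) {
		if d > 0 {
			c.lockTTL = d
		}
	}
}

func newConfig(opts ...Option) config {
	c := config{lockTTL: defaultLockTTL}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

// translate maps the backend's missing-object error to ErrNotFound and
// passes every other error (including nil) through unchanged.
func translate(hash string, err error) error {
	if errors.Is(err, errNoSuchKey) {
		return fmt.Errorf("chunk %s: %w", hash, ErrNotFound)
	}
	return err
}

// isNotFound reports whether err wraps ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	fmt.Println(newConfig().lockTTL, newConfig(WithLockTTL(5*time.Second)).lockTTL)
	fmt.Println(isNotFound(translate("abc123", errNoSuchKey)))
}
```

If every method that can hit a missing object routes its error through one translation function, Get, HasChunk, and Delete cannot drift apart in how they report it.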

@dosubot dosubot bot added size:XL (This PR changes 500-999 lines, ignoring generated files.) and removed size:L (This PR changes 100-499 lines, ignoring generated files.) labels Feb 1, 2026
@kalbasit kalbasit enabled auto-merge (squash) February 1, 2026 04:07
@kalbasit kalbasit merged commit 4b31a12 into main Feb 1, 2026
14 checks passed
@kalbasit kalbasit deleted the 01-30-feat_add_s3_storage_support_for_chunks branch February 1, 2026 04:18

Labels

enhancement (New feature or request), go (Pull requests that update go code), size:XL (This PR changes 500-999 lines, ignoring generated files.)

2 participants