Questions around Iceberg-rust

Hello, I had some questions around Iceberg-rust regarding data interactions with S3, authn, and authz.

1. How does connecting an Iceberg catalog with a specific S3 bucket work? I understand the structure on S3 with dividing a table into parquet data files and avro metadata files, but I am not sure how the relationship between this file organization and a deployed catalog works, and how to configure that exactly.

2. Where does Pyiceberg fit into Iceberg-rust? Would it be possible to deploy Iceberg-rust on the server side, and interact with the rest catalog through Pyiceberg? I like python as a nice interface for data consumers to interact with a catalog, and for basic management of tables.

3. What are the write table options with an Iceberg rust? As of now, is it only possible with a distributed engine like Spark or Trino? What would be the bottlenecks to duckdb, polars, or Ibis+backend writes? The vast majority of my datasets are less than 50Gb currently, and most workloads a fraction of that. I would like to use Iceberg for its superior data management vs files, but initially for use cases that can mostly be done on a single node and don't really need the power of distributed engines.

4. How does authentication and authorization work with the current Iceberg-rust? The access control system I described above works for AWS S3 and sharing files. Any pointers about where I could learn to integrate IAM permissions into a catalog and tables? It seems the creators of https://github.com/hansetag/iceberg-catalog are in the middle of implementing some of these exact features. I would love to contribute on these features and implement for my use case. It seems the way it works where non-AWS credentials are vended to consumers, and the catalog uses AWS credentials to sign S3 requests for the users, but I am not sure. I am also not sure how this implementation compares with the open-sourced implementation released by Databricks.

5. Where exactly does OpenDAL fit into the Iceberg-rust catalog? Would OpenDAL help standardize accessing data from the catalog? The custom metadata https://github.com/apache/opendal/issues/4842 feature could also be useful for connecting tables to different authz commands.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Questions around Iceberg-rust #450

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Questions around Iceberg-rust #450

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions