Skip to content

Conversation

@singhpk234
Copy link
Contributor

@singhpk234 singhpk234 commented Sep 3, 2025

About the change

This provides a reference implementation for passing the view name that table is referenced in as part of which the loadTable call is being made.

Details on the spec change proposal and motivation here

Copy link
Collaborator

@gaborkaszab gaborkaszab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @singhpk234 !
I took a look for my own benefit, and left some observations and questions meanwhile. Let me know what you think!

* @throws NoSuchTableException if the table does not exist
*/
Table loadTableViaView(TableIdentifier identifier, Map<String, Object> context)
throws NoSuchTableException;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the referencing view doesn't exist? Shouldn't we check for the referencing view and throw NoSuchViewException, or should we silently omit if the view doesn't exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the concern that its a public API, that we should validate this ? My intention was to treat this as a runtime property, I renamed this API to loadTableWithContext, please let me know if this helps addressing this concern.

* @return the loaded table
* @throws NoSuchTableException if the table does not exist
*/
Table loadTableViaView(TableIdentifier identifier, Map<String, Object> context)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expect more context params to this function? Can't we have a simple String parameter to hold the name of the view? Or Maybe a TableIdentifier?

Copy link
Contributor Author

@singhpk234 singhpk234 Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expect more context params to this function

That was the intention, to leave door open to pass more runtime context to the catalog, i was trying to model this as something like we already do i.e SessionContext this is a more runtime context so that engines can plumb their context back to the catalog and then catalog client can use this to make subsequent calls to the server. Hence didn't just have view identifier in the signature.


@Override
public Table loadTableViaView(TableIdentifier identifier, Map<String, Object> viewContext) {
if (delegate instanceof ContextAwareTableCatalog) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delegate is a BaseSessionCatalog.AsCatalog that now implements ContextAwareTableCatalog. Is there a use-case or configuration where delegate doesn't implement the new interface?

I see the concept in this class is rather to have a new member of type ContextAwareTableCatalog that is converted from delegate and this new function could directly call that. Similarly as nsCatalog.

Would this also work here? Do I miss something?

}

@Override
public Table loadTableViaView(Identifier ident, String version, Map<String, Object> context)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as in core/: Now a loadTableViaView function performs the load table even not via view, that seems a bit weird.

Copy link
Contributor Author

@singhpk234 singhpk234 Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed it to loadTableWithContext where context would be a runtime context, let me know does is sounds better.

String[] namespaceParts = new String[parts.length - 1];
System.arraycopy(parts, 0, namespaceParts, 0, parts.length - 1);
String viewName = parts[parts.length - 1];
queryParams.put(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding:
Let's take the following input: "ns1.ns2.view1"
We convert this to "ns1%1Fns2%1F.view1"
Is this because we wan't to tell the REST catalog which part is the namespace and which is the view name? If we include a dot in the query params, can't we just send the view path as it is with dots and let the catalog server figure out the namespace/view name part that we do here? Or can we also use the same special char between the namespace and the view name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one way to look at this is anything from the last . to the end is the view name and the every thing before it is the namespace, the reason %1F was chosen the separator was to use something similar, the reason i choose the %1F is the language of the existing rest spec for the nested namespaces.

I think we are yet to conclude on the separator discussion.

@singhpk234 singhpk234 force-pushed the feature/ref-imp-referenced-by branch 2 times, most recently from 97296ab to bcec3f2 Compare September 8, 2025 00:22
@sfc-gh-prsingh sfc-gh-prsingh force-pushed the feature/ref-imp-referenced-by branch from a34c0f2 to fb6e824 Compare November 3, 2025 12:12
@singhpk234 singhpk234 changed the title Spark, Rest: Reference implementation of referenced-by in the loadTable call [REST | SPARK]: Reference implementation of referenced-by in the loadTable call Nov 3, 2025
@sfc-gh-prsingh sfc-gh-prsingh force-pushed the feature/ref-imp-referenced-by branch from 20d30b3 to 737fc50 Compare November 3, 2025 12:31
@github-actions
Copy link

github-actions bot commented Dec 4, 2025

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@sfc-gh-prsingh sfc-gh-prsingh force-pushed the feature/ref-imp-referenced-by branch 2 times, most recently from 3271d52 to f368355 Compare January 11, 2026 05:30
@sfc-gh-prsingh sfc-gh-prsingh force-pushed the feature/ref-imp-referenced-by branch from f368355 to 57e3e30 Compare January 11, 2026 05:39
Copy link
Collaborator

@gaborkaszab gaborkaszab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the api/ and core/ part for now (haven't checked Spark). Left some questions and nits. I also found some of my previous comments so I'm not sure this was ready for review.
Anyway, some test coverage in TestRESTCatalog would be great!

* <p>Common context keys:
*
* <ul>
* <li>{@link #VIEW_IDENTIFIER_KEY}: A List&lt;TableIdentifier&gt; of view(s) referencing this
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to escape the lt and gt chars?
"List<TableIdentifier>"


default Table loadTableWithContext(
SessionContext sessionContext, TableIdentifier ident, Map<String, Object> context) {
return loadTable(sessionContext, ident);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered throwing UnsupportedOperationException instead of silently ignore the context param?


@Override
public Table loadTable(TableIdentifier identifier, Map<String, Object> loadingContext) {
return BaseSessionCatalog.this.loadTableWithContext(context, identifier, loadingContext);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe consider renaming context to sessionContext to avoid confusion with the new type of context?

.get(
paths.table(identifier),
snapshotModeToParam(mode),
referencedByToQueryParam(snapshotModeToParam(mode), viewContext),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: here at the calcite the reader could have the expression that the order of referencedByToQueryParam and snapshotModeToParam is interchangeable, but in practice it isn't. Maybe having a separate function to paramsForLoadTable that produces both?

Map<String, String> queryParams = Maps.newHashMap(params);
Object viewIdentifierObj = context.get(ContextAwareTableCatalog.VIEW_IDENTIFIER_KEY);

if (!(viewIdentifierObj instanceof List)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe Preconditions.checkArgument(viewIdentifierObj instanceof List, error msg)

}

@SuppressWarnings("unchecked")
List<TableIdentifier> viewIdentifiers = (List<TableIdentifier>) viewIdentifierObj;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we include a check into the above that not just check for List type but for List? Wouldn't this result in a ClassCastException or something similar if we input List that contains anything other than TableIdentifier?

}

// Format per REST API spec:
// - Namespace parts joined by namespace separator (0x1F, URL-encoded as %1F)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separator is configurable now, the comment now is invalid

// - Commas in view names must be percent-encoded as %2C
List<String> formattedViews = Lists.newArrayListWithCapacity(viewIdentifiers.size());
for (TableIdentifier viewId : viewIdentifiers) {
// Encode namespace using RESTUtil which handles the namespace separator
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think these comments in the for loop are necessary. The code seems pretty straightforward and the longer comment above the loop also helps


@Override
public Table loadTable(SessionContext context, TableIdentifier identifier) {
public Table loadTableWithContext(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this had been discussed already, but can't this be simply loadTable? The params will tell us anyway that this is with context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants