SPEC: Add referenced-by in loadTable API #13810

singhpk234 · 2025-08-14T00:55:25Z

About the change

This change proposes a referenced-by parameter in the loadTable request. It is expected to contain the fully qualified name (FQN) of the view (a 2-part identifier only). A REST catalog would expect this from the client; from it, the catalog knows that the table is being loaded in the context of a view (a view referencing the table) and can act accordingly.

This would be really helpful in the following cases:

Supporting security for views, i.e. definer / invoker mode:
Definer mode means access to the table should be authorized against the principal that created the view. This will be a very common case, where one wants to create a view and grant access to the view but not to the underlying table.

Invoker mode means access to the table should be authorized against the principal that is calling loadTable, which is essentially what happens by default.

When a view is defined with definer-mode security, referenced-by gives the catalog a proper signal that this loadTable is happening in the context of a view (i.e. the view is referencing the table), so the catalog can look up who the creator was and which security mode is defined, and take the proper authZ action.
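A minimal sketch of the decision described above, assuming a hypothetical in-memory registry of views. `ViewInfo`, `principal_to_authorize`, and the fallback for an unknown view are illustrative assumptions, not part of the REST spec or of any catalog implementation:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ViewInfo:
    owner: str          # principal that created the view
    security_mode: str  # "DEFINER" or "INVOKER"


def principal_to_authorize(
    caller: str,
    referenced_by: Optional[str],
    views: Dict[str, ViewInfo],
) -> str:
    """Pick the principal whose table access is checked for a loadTable call."""
    if referenced_by is None:
        # Plain loadTable with no view context: authorize the caller.
        return caller
    view = views.get(referenced_by)
    if view is None:
        # Assumption: an unknown view falls back to the caller. The thread
        # does not settle what a catalog should do in this case.
        return caller
    return view.owner if view.security_mode == "DEFINER" else caller


views = {
    "sales.definer_view": ViewInfo(owner="tom", security_mode="DEFINER"),
    "sales.invoker_view": ViewInfo(owner="tom", security_mode="INVOKER"),
}
print(principal_to_authorize("jerry", "sales.definer_view", views))
print(principal_to_authorize("jerry", "sales.invoker_view", views))
```

How the catalog then authorizes the chosen principal is left to the catalog, as the discussion further down points out.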

Reference Implementation

[REST | SPARK]: Reference implementation of referenced-by in the loadTable call #13979

~~TODO: send a dev list thread.~~

singhpk234 · 2025-08-21T20:38:46Z

devlist thread - https://lists.apache.org/thread/01gb9rygdd1gqks7lnl1o6440qocnh9m

flyrain · 2025-09-05T01:11:30Z

open-api/rest-catalog-open-api.yaml

            type: string
            enum: [ all, refs ]
+        - in: query
+          name: referenced-by


Can we call it something like referenced-by-view or from-view since it can only be referenced by a view here? I understand that the description already explicitly says that it's a view, but people can still misuse it given a generic name like referenced by.

flyrain

LGTM. Thanks @singhpk234 !

gaborkaszab

One comment about the expected behavior when the referenced-by view doesn't exist.

github-actions · 2025-10-24T00:15:02Z

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

github-actions · 2025-10-31T00:17:25Z

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

gaborkaszab

I think this is fine, just left some nits or questions

qqqttt123 · 2026-01-29T07:30:26Z

Thanks for your great work! Although we should trust the clients, I still have some concerns about the security risk, because this relies on client-provided information without server-side validation. The server cannot verify whether the referenced view actually exists or legitimately references the table being loaded, potentially allowing permission bypasses.

For example:
Tom, who has the privilege to read table B and table C, creates a view A using SQL.

CREATE VIEW A 
SQL SECURITY DEFINER
AS SELECT * FROM B WHERE department = 'Engineering';

Jerry has the privilege to read view A. If he forges a request to load table C referenced by view A, it creates a security risk.

We had better add more constraints to the createView request. We should add the required table identifiers to the request, so that we can validate the table identifiers on the server side when loading tables referenced by the view.
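The validation suggested here could look roughly like the sketch below. It assumes the catalog has recorded, at view creation time, which tables each view references; `VIEW_TABLES` and `is_legitimate_reference` are hypothetical names, and nothing like this is part of the proposed spec change:

```python
from typing import Dict, FrozenSet

# Hypothetical server-side record of the tables each view references,
# captured when the view was created.
VIEW_TABLES: Dict[str, FrozenSet[str]] = {
    "A": frozenset({"B"}),
}


def is_legitimate_reference(view: str, table: str) -> bool:
    """Return True only if the named view is known to reference the table."""
    return table in VIEW_TABLES.get(view, frozenset())


# Jerry loads table B through view A: consistent with the view definition.
print(is_legitimate_reference("A", "B"))
# Jerry claims table C is referenced by view A: rejected.
print(is_legitimate_reference("A", "C"))
```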

singhpk234 · 2026-01-29T07:47:07Z

@qqqttt123 thank you for the feedback!
I believe that for the catalog to authorize access to the table based on the client's input, there needs to be a notion of trust between the catalog and the client. We define neither AuthZ nor a notion of a Trusted Engine in IRC, so how the catalog authorizes access to the table based on referenced-by is intentionally not described here.

We covered this discussion here as well as in the design doc here. As the proposal mentions, this is a first step towards enabling it, and subsequent work will be required on the catalog end to complete it end to end.

Please let us know if this helps answer your concern!

qqqttt123 · 2026-01-29T08:13:41Z

Yes, it helps a lot. Thanks. We shouldn't bring authorization concepts into the spec.

I found that Trino adds ICEBERG_VIEW_RUN_AS_OWNER as a property of the view. Maybe we should add the table identifiers of the view to the properties, too. It would benefit authorization in the Iceberg REST catalog.

- Rename parameter from 'referenced-list' to 'referenced-by'
- Update namespace separator docs to align with PR apache#14448
- Clarify parsing rules for dot separator between namespace and view name
- Document comma separator for multiple view identifiers
- Add example showing multiple comma-separated views
- Specify URL encoding for commas in view names (%2C)

Format: ?referenced-by=namespace.view1,namespace.view2 where namespace parts use configurable separator (default %1F)

Separators used:
- %1F (or configured): between namespace parts
- . (dot): between namespace and view name
- , (comma): between multiple view identifiers
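The separator rules in this commit message can be sketched as a small encoder and decoder. This is an illustration only: `encode_referenced_by` and `decode_referenced_by` are hypothetical helpers, and splitting on the last dot is an assumption, since the exact parsing rule for the dot is not quoted in this thread:

```python
from typing import List, Tuple
from urllib.parse import quote, unquote

NAMESPACE_SEPARATOR = "\x1f"  # sent as %1F; described above as configurable

ViewId = Tuple[List[str], str]  # (namespace parts, view name)


def encode_referenced_by(views: List[ViewId]) -> str:
    """Build the referenced-by query value from view identifiers."""
    encoded = []
    for namespace_parts, view_name in views:
        identifier = NAMESPACE_SEPARATOR.join(namespace_parts) + "." + view_name
        # safe="" percent-encodes commas (%2C) and the unit separator (%1F);
        # dots are never percent-encoded by quote().
        encoded.append(quote(identifier, safe=""))
    return ",".join(encoded)


def decode_referenced_by(value: str) -> List[ViewId]:
    """Parse a raw (still percent-encoded) referenced-by query value."""
    views = []
    for item in value.split(","):
        # Assumption: the last dot separates the namespace from the view name.
        namespace, _, view_name = unquote(item).rpartition(".")
        views.append((namespace.split(NAMESPACE_SEPARATOR), view_name))
    return views


value = encode_referenced_by([(["prod", "sales"], "v1"), (["prod"], "a,b")])
print("?referenced-by=" + value)
print(decode_referenced_by(value))
```

The decoder has to run on the raw query value. If a web framework has already percent-decoded the parameter, a comma inside a view name can no longer be told apart from the comma between identifiers.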

rdblue · 2026-01-30T16:45:54Z

open-api/rest-catalog-open-api.yaml

+          name: referenced-by
+          description:
+            A comma-separated list of fully qualified view names (namespace and view name) representing the view
+            reference chain when a table is loaded via a view. The list should be ordered with the outermost view


Why do we need the reference chain and not the last view to reference the table? Is there some case that we can't handle with just the direct reference?

It's for the cases where the directly referencing view is not a definer view but some view in the chain might be a DEFINER view. Hence we decided to send the whole reference chain instead of just the directly referencing view.

We discussed this in the catalog sync as well. Notes: Link

I think I was there, but I don't remember there being a specific conclusion to that question. I think we need to be clear about exactly what this is for before moving forward so that we know that this requires the full reference list.

If I understand correctly, the idea is that you have a nested view that is run as INVOKER within a view run as DEFINER. If the inner view were run as DEFINER, we would not need any references from outside of that view because it is self-contained, so it must be that we have an INVOKER within a DEFINER. Then the question is how to handle the inner INVOKER: is the definer of the outer view considered the invoker of the inner view? If so, we need to know about the outer view.

This may seem like an obvious choice, but always using the access privileges from the original query is how PostgreSQL works:

If the view has the security_invoker property set to true, access to the underlying base relations is determined by the permissions of the user executing the query, rather than the view owner. Thus, the user of a security invoker view must have the relevant permissions on the view and its underlying base relations.

If any of the underlying base relations is a security invoker view, it will be treated as if it had been accessed directly from the original query. Thus, a security invoker view will always check its underlying base relations using the permissions of the current user, even if it is accessed from a view without the security_invoker property.

If we used this interpretation, that the INVOKER is not context-dependent, then we would have a simpler spec and behavior (DEFINER=definer's permissions, INVOKER=query permissions), but at the cost of removing flexibility for catalog implementers.

I'm trying to think whether this is a situation where the simpler option would be too restrictive. I think the case where you would want to leave the inner view as INVOKER is when you want both INVOKER and DEFINER behavior when accessed directly (INVOKER) and indirectly (DEFINER). That seems like a weird pattern with easy work-arounds to me: if you want to delegate when accessing indirectly, you can embed the INVOKER view SQL in the DEFINER view.

I think I'd vote for the simpler option, but I'm interested in a bit more discussion on this before we finalize it.
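The two interpretations being weighed here can be put side by side in a short sketch. It assumes the chain is listed outermost view first and that each view is reduced to an (owner, security mode) pair; both function names are hypothetical and neither is defined by the spec:

```python
from typing import List, Tuple

View = Tuple[str, str]  # (owner, security mode: "DEFINER" or "INVOKER")


def context_dependent(query_user: str, chain: List[View]) -> str:
    """An INVOKER view inherits whoever is effectively invoking it.

    Walking from the outermost view inward, every DEFINER view switches the
    effective principal to its owner, so the full chain is required.
    """
    principal = query_user
    for owner, mode in chain:
        if mode == "DEFINER":
            principal = owner
    return principal


def context_independent(query_user: str, chain: List[View]) -> str:
    """INVOKER always means the user who issued the original query.

    Only the view that directly references the table (the last one) matters.
    """
    if not chain:
        return query_user
    owner, mode = chain[-1]
    return owner if mode == "DEFINER" else query_user


# A DEFINER view owned by user_a wraps an INVOKER view; user_b runs the query.
chain = [("user_a", "DEFINER"), ("user_c", "INVOKER")]
print(context_dependent("user_b", chain))
print(context_independent("user_b", chain))
```

Under the context-independent reading only the last entry is consulted, which is why the direct reference would be enough in that case.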

Do we have any prior art for engines/frameworks/systems that allow you to override a downstream invoker?

To make this scenario concrete:

Definer => Invoker => Table
Definer view made by User A
Table is accessible to User A
User B queries the definer view

Does User B see rows from the table?

From Ryan's quote this doesn't seem like it would be allowed in postgres

I was checking Trino and it doesn't look like it supports this either.

In the INVOKER security mode, tables referenced in the view are accessed using the permissions of the user executing the query (the invoker of the view). A view created in this mode is simply a stored query.

Snowflake doesn't support Invoker views, so it's always Definer

Not that I want to forbid a catalog being able to do this, but I think it would be helpful to know if anyone actually plans on allowing this pattern? Feels like it would be a security hole?

Are there any other use cases?

Feels like it would be a security hole?

I'm not sure I follow the case where this could be a security hole. Any time you get the permissions of a DEFINER, you must have access to the DEFINER view. Wouldn't it be strange if the catalog's intent was to nest an INVOKER view inside a DEFINER view in order to protect data referenced by the INVOKER? And I don't think it's a hole if that's the case because the catalog is what gets to decide (at least with the referenced-by chain) what the behavior is.

Sorry, I mean more like a practice I would not encourage.

I feel like if you have an invoker, that is a signal that the data behind the invoker is very sensitive and only certain users should have it. If you add a definer on top of the invoker view, it's like you are bypassing that. Better to build a definer view in the first place.

github-actions bot added the OPENAPI label Aug 14, 2025

singhpk234 force-pushed the feature/loaded-via-view branch from fe3b4cb to c66b870 August 14, 2025 01:43

github-actions bot added the core and spark labels Aug 14, 2025

singhpk234 force-pushed the feature/loaded-via-view branch 5 times, most recently from ac2ecee to afdb92b August 17, 2025 23:25

singhpk234 changed the title ~~SPEC: Add loaded-via in loadTable API~~ SPEC: Add referenced-by in loadTable API Aug 21, 2025

sungwy reviewed Aug 22, 2025 (comment on open-api/rest-catalog-open-api.yaml, outdated)

singhpk234 added the Specification label (Issues that may introduce spec changes.) Aug 28, 2025

github-actions bot added the API label Sep 2, 2025

singhpk234 force-pushed the feature/loaded-via-view branch 4 times, most recently from 094b0f0 to 89e088b September 3, 2025 00:59

github-actions bot removed the API label Sep 3, 2025

singhpk234 mentioned this pull request Sep 3, 2025 in [REST | SPARK]: Reference implementation of referenced-by in the loadTable call #13979 (open)

singhpk234 marked this pull request as ready for review September 5, 2025 01:04

flyrain reviewed Sep 5, 2025

flyrain approved these changes Sep 5, 2025

gaborkaszab reviewed Sep 5, 2025 (comment on open-api/rest-catalog-open-api.yaml)

rdblue reviewed Sep 10, 2025 (comment on open-api/rest-catalog-open-api.yaml, outdated)

singhpk234 force-pushed the feature/loaded-via-view branch from 89e088b to d62e21c September 17, 2025 17:47

nastra reviewed Sep 22, 2025 (comment on open-api/rest-catalog-open-api.yaml, outdated)

github-actions bot added the stale label Oct 24, 2025

nastra reviewed Jan 13, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

nastra reviewed Jan 13, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

nastra reviewed Jan 13, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

nastra approved these changes Jan 13, 2026

gaborkaszab reviewed Jan 13, 2026 (two comments on open-api/rest-catalog-open-api.yaml, one outdated)

singhpk234 requested review from gaborkaszab and nastra January 26, 2026 17:42

nastra approved these changes Jan 27, 2026

adnanhemani approved these changes Jan 27, 2026

gaborkaszab approved these changes Jan 28, 2026

huaxingao approved these changes Jan 28, 2026

stevenzwu approved these changes Jan 28, 2026

qqqttt123 approved these changes Jan 29, 2026

danielcweeks reviewed Jan 29, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

sfc-gh-prsingh added 4 commits January 29, 2026 14:18

spec change only (561ac75)

Address review feedback-part1 (f615a39)

reorder the view example to make it logically consistent (430b5bd)

sfc-gh-prsingh force-pushed the feature/loaded-via-view branch from 4e98998 to 430b5bd January 29, 2026 22:18

singhpk234 requested a review from danielcweeks January 29, 2026 22:36

danielcweeks approved these changes Jan 29, 2026

rdblue reviewed Jan 30, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

rdblue reviewed Jan 30, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

rdblue reviewed Jan 30, 2026 (comment on open-api/rest-catalog-open-api.yaml, outdated)

sfc-gh-prsingh added 2 commits January 30, 2026 10:05

Address feedback part-1 from ryan (908fca6)

Address feedback of ryan part-2 (e615deb)
