landreev (Contributor) commented Nov 20, 2025

What this PR does / why we need it:

This is a low-hanging fruit/easily-achievable part of the parent issue #11529 that will make it possible to configure quotas on individual datasets. The primary use case is datasets in root and/or another collection where defining a collection-level quota may be impractical. The functionality was already there - the implementation underneath operates on DvObjectContainers; it was in fact already possible to create a dataset-level quota by directly inserting an entry into the storagequota table. This PR exposes the functionality via the new APIs under /api/datasets/.

The part of the parent issue that will require more design and development, configurable quotas for users, remains to be prioritized separately.

Which issue(s) this PR closes:

Special notes for your reviewer:

Note that as of right now (11 am Nov. 21) the checklist below says that the continuous integration test has failed ("The build of this commit was aborted"). But this is because I killed a redundant extra Jenkins run that was triggered by a typo fix in the release note. The last Jenkins run did actually pass - see https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/view/change-requests/job/PR-11997/.

Suggestions on how to test this:

Straightforward. Enable a quota on a dataset (see the section of the API guide and/or the FilesIT test that are being added here), make sure uploads are blocked once it is reached.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@github-actions github-actions bot added the FY26 Sprint 10 (2025-11-05 - 2025-11-19) and Size: 10 (A percentage of a sprint. 7 hours.) labels Nov 20, 2025
@landreev landreev moved this to Ready for Triage in IQSS Dataverse Project Nov 20, 2025
coveralls commented Nov 20, 2025

Coverage Status

coverage: 24.014% (-0.02%) from 24.037%
when pulling 4c61b5e on 11987-storage-quotas-on-datasets
into b8f5c1e on develop.

@cmbz cmbz moved this from Ready for Triage to Ready for Review ⏩ in IQSS Dataverse Project Nov 20, 2025
@cmbz cmbz added this to the 6.9 milestone Nov 20, 2025
…was "ManagePermissions" in the original collection-level command, which I initially copied into the new dataset equivalent. But I don't think it was the right, or even an intentional, choice. #11987
@stevenwinship stevenwinship self-assigned this Nov 21, 2025
@stevenwinship stevenwinship moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Nov 21, 2025
StorageQuota storageQuota = target.getStorageQuota();

if (storageQuota != null) {
    storageQuota.setAllocation(allocation);
Contributor

Is there any check to prevent negative numbers? I guess 0 or negative will just block the ability to store more data.

Contributor Author

Yes, that is the answer - it is possible to set it to a negative number. But it will be equivalent to zero in practice.
Tbh, I am generally less concerned about preventing invalid or meaningless values from being entered when it comes to superusers-only APIs. Under the assumption that they should know what they are doing, and/or can be expected to own the consequences of their actions.
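
To illustrate why a negative allocation ends up equivalent to zero, here is a minimal sketch. It is not the actual Dataverse quota check: the class name, the method names and the clamping with `Math.max` are all assumptions made for illustration. The point is only that once the remaining space is computed as "allocation minus bytes used", any allocation at or below zero leaves nothing for a non-empty upload.

```java
// Hypothetical sketch; these names do not come from the Dataverse code base.
class QuotaCheckSketch {

    /** Bytes still available under the quota; never reported as negative. */
    static long remainingBytes(long allocation, long bytesUsed) {
        // A zero or negative allocation leaves nothing to subtract from,
        // so both cases collapse to "no space left".
        return Math.max(0L, allocation - bytesUsed);
    }

    /** An upload is accepted only if it fits into the remaining space. */
    static boolean uploadAllowed(long allocation, long bytesUsed, long uploadSize) {
        return uploadSize <= remainingBytes(allocation, bytesUsed);
    }

    public static void main(String[] args) {
        System.out.println(uploadAllowed(1_000_000L, 250_000L, 500_000L)); // true
        System.out.println(uploadAllowed(0L, 0L, 1L));                     // false
        System.out.println(uploadAllowed(-42L, 0L, 1L));                   // false
    }
}
```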

@cmbz cmbz added the FY26 Sprint 11 (2025-11-20 - 2025-12-03) label Nov 22, 2025

landreev (Contributor, Author) commented Dec 2, 2025

> ... This appears to be caused by delete just setting the local quota allocation to null rather than removing the StorageQuota

I did notice that myself the other day. I may have even done that on purpose... but the behavior is inconsistent/counter-intuitive, I agree.

landreev (Contributor, Author) commented Dec 2, 2025

> The /storageDriver?showRemainingQuotas api call (already being changed) shows no new keys if quota not set - with the replacement api should there be a positive indication in the json that there are no quotas in that case?

That was also part of my rationale behind shoving this limits info into an existing API (optionally, when present); vs. figuring out how to communicate that there are no limits, in an API dedicated to showing the limits.
My current suggestion (1fbbff0) is to have a dedicated API:

/api/datasets/NNNNN/uploadlimits
{
  "status": "OK",
  "data": {
    "uploadLimits": {
      "numberOfFilesRemaining": 8,
      "storageQuotaRemaining": 1046483
    }
  }
}

or, in the absence of any limits

{
  "status": "OK",
  "data": {
    "uploadLimits": {}
  }
}

Would this be positive enough?
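
For clarity, the "only include a key when the limit exists" convention shown in the two responses above can be sketched as follows. This is an illustration, not the code in the PR: the class and method names are hypothetical, and a plain `Map` stands in for whatever JSON builder the real endpoint uses. Only the two key names are taken from the example output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch of building the "uploadLimits" object.
class UploadLimitsSketch {

    /** Adds a key only for a limit that is actually defined. */
    static Map<String, Long> uploadLimits(OptionalLong filesRemaining,
                                          OptionalLong bytesRemaining) {
        Map<String, Long> limits = new LinkedHashMap<>();
        filesRemaining.ifPresent(n -> limits.put("numberOfFilesRemaining", n));
        bytesRemaining.ifPresent(n -> limits.put("storageQuotaRemaining", n));
        return limits;
    }

    /** A client can treat an empty object as "no limits apply". */
    static boolean hasAnyLimit(Map<String, Long> limits) {
        return !limits.isEmpty();
    }

    public static void main(String[] args) {
        // Both limits defined: both keys are present.
        System.out.println(uploadLimits(OptionalLong.of(8), OptionalLong.of(1046483)));
        // No limits defined: the object is empty.
        System.out.println(uploadLimits(OptionalLong.empty(), OptionalLong.empty()));
    }
}
```

With this convention a client never has to interpret a sentinel value such as -1; the absence of a key is the signal.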

landreev (Contributor, Author) commented Dec 2, 2025

> Unlike size quotas, file count limits can't be set on collections/inherited. Is that by design or something that could/should be added at some point?

I didn't implement those... but I am 99.9% positive that they are in fact inherited, from the nearest container with a limit defined. ... will double-check.
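
If the limits are indeed inherited from the nearest container with a limit defined (still to be double-checked, as noted above), the lookup would amount to walking up the parent chain. The sketch below is an assumption-laden illustration of that idea only: `Container`, its fields and `effectiveFileLimit` are hypothetical names, not the DvObjectContainer API.

```java
import java.util.OptionalInt;

// Hypothetical sketch of "nearest ancestor with a limit wins".
class FileLimitInheritanceSketch {

    /** Stand-in for a collection or dataset; not a real Dataverse class. */
    static final class Container {
        final Container parent;   // null for the root collection
        final Integer fileLimit;  // null when no limit is defined here

        Container(Container parent, Integer fileLimit) {
            this.parent = parent;
            this.fileLimit = fileLimit;
        }
    }

    /** Returns the limit of the closest container that defines one. */
    static OptionalInt effectiveFileLimit(Container start) {
        for (Container c = start; c != null; c = c.parent) {
            if (c.fileLimit != null) {
                return OptionalInt.of(c.fileLimit);
            }
        }
        return OptionalInt.empty(); // no limit anywhere up the chain
    }

    public static void main(String[] args) {
        Container root = new Container(null, null);
        Container collection = new Container(root, 100);
        Container dataset = new Container(collection, null);
        // The dataset has no limit of its own, so the collection's applies.
        System.out.println(effectiveFileLimit(dataset).isPresent());
    }
}
```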

landreev (Contributor, Author) commented Dec 2, 2025

> File limits have no setting, size quotas do: :UseStorageQuotas - is a setting needed? Should it cover both file/size limits

Yeah, this seems somewhat arbitrary; in practice it is simply a result of these two limits having been implemented by different developers. Although I should have thought of making them more consistent when reviewing the file counts PR.

I personally feel like having a global setting is useful - for the sake of being able to flip the enforcement on and off, without losing the configuration. Would probably vote for having 2 separate settings for enforcing these.

landreev (Contributor, Author) commented Dec 2, 2025

Overall, I would prefer to have a separate PR for following up on many of the suggestions above, rather than trying to address them here on short notice (again, this was supposed to be a "low effort" PR).
With some exceptions - I will go ahead and address this:

> The pattern of using a POST with a numeric path variable, used in this PR and earlier ones, seems to be ~non-restful -

I don't know what I was thinking there really... Did I copy-and-paste that solution from some other place in the API ?? - idk.
And if we want it to be changed to a more reasonable -X PUT -d '...', I feel like now is a good, or even the last, chance: up until now no one has been using these APIs in real life (afaik), but we are actively instituting quotas in prod. here, and I am in the process of passing the command lines for the APIs to the support/curation team.

Somewhat on the fence about the -X DELETE /api/data*/{id}/storage/quota behavior. I agree that it's not ideal as implemented; but it's not as urgent for us, since our quotas will be defined only on collections and individual datasets immediately in the root collection. I.e., no dataset should have a quota defined on more than one ancestor. ... but may as well.

This I find very troubling:

> For a non-superuser, adding more files than allowed by quota results in a subset of files being added (with no warning that all files won't be added)
> For a non-superuser, adding a file over the limit results in the file simply disappearing in the upload pane (again w/o any explanation)

I'm quite positive that both my and Steven's implementations did have such warning messages in the JSF UI. Did they get muted by later changes to the page? - idk. I want this fixed, but in a separate PR.

@landreev landreev self-assigned this Dec 3, 2025
@cmbz cmbz added the FY26 Sprint 12 (2025-12-03 - 2025-12-17) label Dec 3, 2025

landreev (Contributor, Author) commented Dec 4, 2025

The changes made per QA feedback:

  • /storageDriver changes rolled back (branch synced w/ Fixes for #11716 #11940 now in develop);
  • dedicated convenience API /uploadlimits added instead;
  • the collection/dataset set quota APIs changed to use PUT;
  • the release note, the guides and the RestAssured tests changed accordingly.
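
As a rough illustration of what a PUT-style call to set a dataset quota could look like from a Java client, here is a sketch that only builds the request and does not send it. Several things in it are assumptions rather than facts from this PR: the `/api/datasets/{id}/storage/quota` path is inferred from the `-X DELETE /api/data*/{id}/storage/quota` pattern mentioned earlier in this thread, sending the allocation in bytes as the plain request body is a guess, and the host name and token are placeholders. The API guide section added by this PR is the authoritative reference.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side sketch; path and body format are assumptions.
class SetDatasetQuotaRequestSketch {

    static HttpRequest buildSetQuotaRequest(String baseUrl, long datasetId,
                                            long allocationBytes, String apiToken) {
        // Assumed path, mirroring the DELETE pattern quoted in the thread.
        URI uri = URI.create(baseUrl + "/api/datasets/" + datasetId + "/storage/quota");
        return HttpRequest.newBuilder(uri)
                // Superuser API token passed in the usual Dataverse header.
                .header("X-Dataverse-key", apiToken)
                // Assumed body: the allocation in bytes, as plain text.
                .PUT(HttpRequest.BodyPublishers.ofString(Long.toString(allocationBytes)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildSetQuotaRequest(
                "https://dataverse.example.org", 42L, 1_048_576L, "placeholder-token");
        System.out.println(request.method() + " " + request.uri());
    }
}
```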

I made an attempt to make -X DELETE remove the quota, instead of setting allocation to null; for some unclear reason that same code was bombing for datasets (??) - so I left it unchanged in both cases, for now.

@landreev landreev removed their assignment Dec 4, 2025
@ekraffmiller ekraffmiller added the GREI Re-arch (Issues related to the GREI Dataverse rearchitecture) and SPA (These changes are required for the Dataverse SPA) labels Dec 4, 2025

github-actions bot commented Dec 4, 2025

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:11987-storage-quotas-on-datasets
ghcr.io/gdcc/configbaker:11987-storage-quotas-on-datasets

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

qqmyers (Member) commented Dec 4, 2025

FWIW: I tested the size and file quota add/delete apis and verified that the new /uploadlimits api correctly reports when a file or size limit, or both, exist, and returns {"status":"OK","data":{"uploadLimits":{}}} when they don't.

I see a semantic API IT test failing in the latest build - I've seen that in other PRs - looks like an intermittent issue with the order of two description fields that I'll investigate, but I'll go ahead and merge this as it shouldn't be related.

qqmyers (Member) left a comment

Latest changes look good

@github-project-automation github-project-automation bot moved this from QA ✅ to Ready for QA ⏩ in IQSS Dataverse Project Dec 4, 2025
@qqmyers qqmyers merged commit d2b6a46 into develop Dec 4, 2025
16 of 17 checks passed
@github-project-automation github-project-automation bot moved this from Ready for QA ⏩ to Merged 🚀 in IQSS Dataverse Project Dec 4, 2025
@scolapasta scolapasta moved this from Merged 🚀 to Done 🧹 in IQSS Dataverse Project Dec 9, 2025

Labels

FY26 Sprint 10 (2025-11-05 - 2025-11-19); FY26 Sprint 11 (2025-11-20 - 2025-12-03); FY26 Sprint 12 (2025-12-03 - 2025-12-17); GREI Re-arch (Issues related to the GREI Dataverse rearchitecture); Size: 10 (A percentage of a sprint. 7 hours.); SPA (These changes are required for the Dataverse SPA)

Projects

Status: Done 🧹

Development

Successfully merging this pull request may close these issues.

Add storage quotas on individual datasets

7 participants