-
Notifications
You must be signed in to change notification settings - Fork 15
Vendor-Configurable Metadata Models #294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
satra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this would work (there are several tests failing), how about get them from a config to be specific? also take a look at the jsonld context generator to see which ones map on specifically to dandiarchive. those would also need to be devendored.
what config do you have in mind since so far we have none -- typically we just resort to some env vars (that's what I was thinking we should do in the short run). The point is that it should be "instance" (not runtime) specific which is ATM not supported at all, as e.g. imagine within the same session uploading to both EMBER and DANDI, so then having either validation or ideally entire model per each such different instance type. |
|
perhaps we are saying the same thing in slightly different ways. dandischema can be devendored and instance agnostic (perhaps and hopefully). anything using it (server, cli, etc.,.) can make an instance of it. i.e. dandiset/asset for an instance of dandi with the right config. so introducing an instance config that can be/required to be passed on to initializing a model. for example, this could also allow EMBER to have different license considerations than DANDI. one possibility is perhaps to still consider a generic core model (e.g., union of licenses, more generic doi/devendored setup like this PR) and then an instance specific validation/schema/constraint/version layer that can be customized. for this PR, let's just simplify(expand) for now. i think some of the context adjustment will also need to happen (here: dandi-schema/dandischema/metadata.py Line 46 in 782c421
"dandi": "http://schema.dandiarchive.org/",
"dcite": "http://schema.dandiarchive.org/datacite/",
"dandiasset": "http://dandiarchive.org/asset/",
"DANDI": "http://dandiarchive.org/dandiset/",to this (note: the schema ones should not change for now at least as the are simple the model) "dandi": "http://schema.dandiarchive.org/",
"dcite": "http://schema.dandiarchive.org/datacite/",
"dandiasset": "http://dandiarchive.org/asset/",
"DANDI": "http://dandiarchive.org/dandiset/",
"emberasset": "http://emberarchive.org/asset/",
"EMBER": "http://emberarchive.org/dandiset/",a full refactor to dandi + instance concepts would require us to make some breaking changes and different identifiers. we could may be craft such a thing this week when we are together. |
|
@yarikoptic -- for LINC, we are currently not issuing DOIs (not using DANDI either) I have had placeholders in for the Django env vars expected -- I can certainly implement wherever EMBER goes for their own logic for LINC though (let me know if this makes sense @kabilar ) cc @satra |
Thanks @yarikoptic. I agree with Aaron. For LINC, we don't use DOIs or |
|
I think there is away to take a config and vendorize the models and enums accordingly even with the technologies we are currently using. Pydantic has a dynamic model creation syntax and |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #294 +/- ##
==========================================
+ Coverage 97.84% 97.89% +0.04%
==========================================
Files 16 18 +2
Lines 2042 2370 +328
==========================================
+ Hits 1998 2320 +322
- Misses 44 50 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@yarikoptic I left a comment at #294 (comment). Otherwise, this PR is good to go. Feel free to squash the commits. There are small and serving as breadcrumbs for review purposes. One note I want to make is that this PR is not a complete devendorization since we default the |
c3c71f4 to
606e1b6
Compare
|
The two remaining tests failures must be unrelated to the current PR. The last time the two test succeeded was three months ago at https://github.com/dandi/dandi-schema/actions/runs/13537569388. They are worthy of further investigation though. |
6f12bca to
6832dab
Compare
|
Let's add testing for 3 scenarios in CI.
records we have which are not vendor specific which we have in For now let's just do
and then we will reconsider... |
|
here you could grab a sample from EMBER https://api-dandi.emberarchive.org/api/dandisets/000004/versions/draft/ but would still need to replace with EMBER: prefix since not vendorized version was used. also think on how to do migration now within EMBER to adjust all those IDs etc... |
…-name Workaround to allow `dandischema.conf.Config` to be initialized with field names
These vars, with the exception of `DJANGO_DANDI_DOI_PUBLISH`, need to be provided together or not at all per https://github.com/dandi/dandi-archive/blob/b7288ec920b0c2cf2efd9c6d43e4edbf5016885d/dandiapi/api/checks.py#L10-L25. Since LINC doesn't publish Dandiset with DOI, it is reasonable not to set these env var. Additionally, the value for `DJANGO_DANDI_DOI_API_PREFIX` before this commit doesn't pass the pattern requirement imposed for DOI prefix being introduced in dandi/dandi-schema#294.
These vars, with the exception of `DJANGO_DANDI_DOI_PUBLISH`, need to be provided together or not at all per https://github.com/dandi/dandi-archive/blob/b7288ec920b0c2cf2efd9c6d43e4edbf5016885d/dandiapi/api/checks.py#L10-L25. Since LINC doesn't publish Dandiset with DOI, it is reasonable not to set these env var. Additionally, the value for `DJANGO_DANDI_DOI_API_PREFIX` before this commit doesn't pass the pattern requirement imposed for DOI prefix being introduced in dandi/dandi-schema#294.
Set env vars need to set the vendor-specific dandi-schema introduced by dandi/dandi-schema#294
Set env vars need to set the vendor-specific dandi-schema introduced by dandi/dandi-schema#294
|
fresh conflict came up again |
Monkey-patches a `conf` module into `dandischema` for older version of `dandischema`. This allows dandi-archive to release without a coordinated release of dandi-schema in regard to the vendorization effort implemented in dandi/dandi-schema#294.
Monkey-patches a `conf` module into `dandischema` for older version of `dandischema`. This allows dandi-archive to release without a coordinated release of dandi-schema in regard to the vendorization effort implemented in dandi/dandi-schema#294.
Put in a tox env to test vendor-configurable dandischema located at dandi/dandi-schema#294. This env can be dropped once dandi/dandi-schema#294 is merged to master and released
Put in a tox env to test vendor-configurable dandischema located at dandi/dandi-schema#294. This env can be dropped once dandi/dandi-schema#294 is merged to master and released
# Conflicts: # dandischema/tests/test_models.py
# Conflicts: # dandischema/models.py
|
@yarikoptic Merge conflicts resolved again. Please take a final look and merge and release. |
|
On a quick look - looks good. I would still though not yet merge/release and aim to finalize/merge first with added dandi-cli flexibility in handling versions... so we have minor change to dandi-schema released first before we jump this one . And then (may be not right away to avoid conflicts) here we might want to add
|
|
ok, I am merging this into master and we will release it by |
to accomodate LINC and EMBER. We are yet to workout proper vendorization where a particular model, ideally not entire run time as we could do via env var, could be parametrized with a specific vendor information. I think that relaxing those regular expressions altogether would help to adopt dandi for custom deployments meanwhile.
@satra -- WDYT for a quick workaround like this? any side effects I am not foreseeing besides "looser" validation?
Also attn @aaronkanzer and @kabilar -- for LINC did you just keep DANDI identifiers?
DONEs:
DANDI_NSKEYinternal variableAssumptions made in this PR:
"EMBER-DANDI"as instance name. (This assumption is used in writing test cases only. So, nothing will break anything if it is wrong.)10.60533(and10.82754for testing). (This assumption is used in writing test cases only. So, nothing will break anything if it is wrong.)DANDI_DOI_PATTERNindandischema.models.py. WithID_PATTERNset to"EMBER"in running the DANDI instance, an example of a suffix would be"ember.001425/0.250514.0602"Incidental changes:
test_dantimeta_1()totest_dandimeta_1()test.ymlto reduce code duplication.DANDI_DOI_PATTERNDANDI_PUBID_PATTERN_basic_publishmeta()intests/utils.pytobasic_publishmeta()TODOs:
jsonschemapackage. Make sure the issue is resolved or adopt a temporary solution.Release Notes
migrate()andvalidate()are not available atdandischemalevel. To use them, one must dofrom dandischema.metadata import migrate, validateinstead offrom dandischema import migrate, validate.