Closed
Labels
maintenance: Work needed to maintain long-term health/performance of code and infrastructure
Description
Service/repository
- ror-community/ror-api
- ror-community/ror-records (GitHub Actions workflows)
Describe the current state/issue
With the deprecation of schema v1, ROR data dumps now contain only a single JSON file in v2 format. However, the indexing code and associated GitHub Actions workflows still expect the old format with two files (v1 and v2).
Old data dump format:
v1.73-2025-10-28-ror-data/
├── v1.73-2025-10-28-ror-data.json
├── v1.73-2025-10-28-ror-data.csv
├── v1.73-2025-10-28-ror-data_schema_v2.json
└── v1.73-2025-10-28-ror-data_schema_v2.csv
New data dump format (v2.0 onwards):
v2.0-2025-12-16-ror-data/
├── v2.0-2025-12-16-ror-data.json
└── v2.0-2025-12-16-ror-data.csv
The current indexrordump.py uses filename pattern matching to detect schema version:
- Files with schema_v2 in the filename are used for the v2 index
- Files without schema_v2 in the filename are used for the v1 index
This logic breaks with v2.0-2025-12-16-ror-data.json, which does not contain schema_v2 in the filename, causing it to be incorrectly treated as the v1 data dump.
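The misclassification can be reproduced with a minimal sketch of the substring rule described above. The function name detect_schema_version is hypothetical and is not taken from indexrordump.py; it only mirrors the stated rule that a filename containing schema_v2 maps to the v2 index and any other filename maps to v1.

```python
def detect_schema_version(filename: str) -> str:
    """Hypothetical stand-in for the filename check described in this issue."""
    # Rule as described: "schema_v2" in the name -> v2, anything else -> v1.
    return "v2" if "schema_v2" in filename else "v1"


# Filenames taken from the two dump layouts listed above.
old_v1 = "v1.73-2025-10-28-ror-data.json"
old_v2 = "v1.73-2025-10-28-ror-data_schema_v2.json"
new_v2 = "v2.0-2025-12-16-ror-data.json"

for name in (old_v1, old_v2, new_v2):
    print(name, "->", detect_schema_version(name))
```

Under this rule the new v2.0 file is reported as v1, because the only signal the check uses is the schema_v2 suffix that the new naming scheme no longer includes.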
Describe the desired state/solution
Complete removal of v1 indexing support from both repositories.
ror-api tasks:
- indexrordump.py#L15-L22: Remove get_nested_names_v1 function
- indexrordump.py#L28-L37: Remove get_nested_ids_v1 function
- indexrordump.py#L139-L152: Update file detection logic
- indexrordump.py#L93-L99: Remove v1 index handling in index_dump() function
- setup.py#L39: Remove schema choice 1 from -s, --schema argument
- setup.py: Update help text to reflect v2-only indexing
- indexror.py#L16-L23: Remove get_nested_names_v1 function
- indexror.py#L29-L38: Remove get_nested_ids_v1 function
- indexror.py#L155-L156: Remove v1 conditional in index() function
- indexror.py#L186-L192: Remove v1 nested field generation in index() function
- settings.py: Assess if INDEX_v1 and INDEX_TEMPLATE_ES7_v1 settings can be removed
- rorapi/v1/: Assess v1 directory contents for removal (index templates, serializers)
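One possible shape for the updated file detection is sketched below. It is an illustration under assumptions, not the actual indexrordump.py code: the helper name find_dump_json is hypothetical, and the sketch assumes the dump has already been unpacked into a directory and that only new-format dumps (exactly one JSON file) need to be supported. Whether old two-file dumps must remain indexable is not stated in this issue, so the sketch simply rejects them.

```python
import os
import tempfile


def find_dump_json(dump_dir: str) -> str:
    """Return the path of the single JSON file in an unpacked v2-only dump.

    Hypothetical helper: it does not inspect the filename for a schema
    marker, it only requires that exactly one .json file is present.
    """
    json_files = sorted(f for f in os.listdir(dump_dir) if f.endswith(".json"))
    if len(json_files) != 1:
        raise ValueError(
            f"expected exactly one JSON file, found {len(json_files)}: {json_files}"
        )
    return os.path.join(dump_dir, json_files[0])


# Usage: build a directory that mimics the new dump layout and locate its JSON file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("v2.0-2025-12-16-ror-data.json", "v2.0-2025-12-16-ror-data.csv"):
        open(os.path.join(tmp, name), "w").close()
    print(os.path.basename(find_dump_json(tmp)))
```

Failing loudly on an unexpected file count avoids silently indexing the wrong file if an old-format dump is passed in by mistake.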
ror-records workflow tasks:
- prod_manual_index.yml: Remove v1 from schema-version choices
- prod_manual_index.yml: Remove v1 API URL conditional logic
- staging_manual_index.yml: Remove v1 from schema-version choices
- staging_manual_index.yml: Remove v1 validation schema path
- staging_manual_index.yml: Remove v1 API URL conditional logic
- prod_index_dump.yml: Remove v1 from schema-version choices
- prod_index_dump.yml: Remove v1 API URL conditional logic
- staging_index_dump.yml: Remove v1 from schema-version choices
- staging_index_dump.yml: Remove v1 API URL conditional logic
Testing tasks:
- Test all changes in dev-ror-records environment first
- Verify indexing works with new v2.0 data dump format
- Verify staging workflows function correctly
- Verify production workflows function correctly
Status
Complete