[MAINTENANCE] Deprecate ror-data repository #359

@adambuttrick

Context

https://github.com/ror-community/ror-data currently stores historical data dump zip files committed by the generate_dump.yml workflow. Once all workflows and services have been migrated to use release artifacts on https://github.com/ror-community/ror-records, this repository should be archived.

Current state

  • ror-data stores historical data dump zip files committed by generate_dump.yml
  • Referenced by:
    • generate_dump.yml (reads previous dump, writes new dump)
    • publish_dump_zenodo.yml (reads dump for Zenodo upload)
    • ror-api getrordump.py (downloads dump for indexing)
  • Also referenced: ror-data-test repo used for test indexing

Steps

  1. Verify all workflows have been migrated and tested:

    • generate_dump.yml reads from and writes to release artifacts
    • publish_dump_zenodo.yml reads from release artifacts
    • ROR API downloads from release artifacts
    • Indexing workflows pass correct parameters
    • curation_ops scripts are compatible with release artifacts
  2. Add README to ror-data explaining:

    • Historical dumps are preserved in this repo
    • New dumps are published as release assets on ror-records
    • Link to the ror-records releases page
  3. Archive the repository:

    • Settings -> Archive this repository
  4. Decide on ror-data-test:

    • Archive it as well, or repurpose it for testing the release artifact workflow
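The "downloads from release artifacts" checks in step 1 could be exercised with something like the sketch below. It uses the GitHub REST endpoint for a repository's latest release and reads `browser_download_url` from the returned assets. The helper names are hypothetical, and it assumes the dump is attached to the release as a single `.zip` asset; the actual asset naming on ror-records is not stated in this issue and may differ.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def latest_release_url(owner: str, repo: str) -> str:
    """Return the REST endpoint for a repository's latest release."""
    return f"{API_ROOT}/repos/{owner}/{repo}/releases/latest"


def pick_dump_asset(release: dict, suffix: str = ".zip"):
    """Return the first release asset whose name ends with `suffix`, or None.

    Assumption: the dump is attached to the release as a single zip asset.
    """
    for asset in release.get("assets", []):
        if asset.get("name", "").endswith(suffix):
            return asset
    return None


def fetch_latest_dump_url(owner: str = "ror-community",
                          repo: str = "ror-records") -> str:
    """Look up the download URL of the dump on the latest release."""
    request = urllib.request.Request(
        latest_release_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        release = json.load(response)
    asset = pick_dump_asset(release)
    if asset is None:
        raise LookupError(f"no zip asset on release {release.get('tag_name')}")
    return asset["browser_download_url"]
```

Calling `fetch_latest_dump_url()` from a staging environment is one way to confirm that a consumer no longer needs ror-data. Unauthenticated requests to the GitHub API are rate limited, so a real workflow would likely pass a token.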

Acceptance criteria

  • All workflows confirmed working without ror-data in both staging and production
  • End-to-end test completed: generate dump -> publish to Zenodo -> index in staging -> index in production
  • README added to ror-data explaining the migration and pointing to ror-records
  • ror-data repository archived
  • Decision made and executed on ror-data-test repository
  • No broken references to ror-data remain in any active workflow or service
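The last criterion can be checked mechanically against a local checkout of each active repository. The following is a minimal sketch, not an existing ROR script: the function name, the file suffixes scanned, and the assumption that references are written as `ror-community/ror-data` are all illustrative. References built from variables or written in another form would not be caught.

```python
import re
from pathlib import Path

# Matches "ror-community/ror-data" but not "ror-community/ror-data-test".
ROR_DATA_REF = re.compile(r"ror-community/ror-data(?![\w-])")


def find_references(root, pattern=ROR_DATA_REF,
                    suffixes=(".yml", ".yaml", ".py", ".md")):
    """Return (path, line number, line) for every match under `root`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types we do not expect to hold references.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

An empty result from `find_references("path/to/checkout")` is evidence, though not proof, that the repository no longer points at ror-data. Passing a different `pattern` allows the same scan for ror-data-test once its fate is decided.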

Metadata

Assignees

No one assigned

    Labels

    maintenance: Work needed to maintain long-term health/performance of code and infrastructure

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests