Metadata 4.7 updates for lupo #1450

Open
svogt0511 wants to merge 39 commits into master from metadata-47
Conversation

svogt0511 (Contributor) commented Jan 16, 2026

Purpose

Changes to lupo required for the Metadata 4.7 release, as described in the document linked below.

Metadata Schema Release 4.7 - Product Spec

closes: Add github issue that originated this PR

Approach

Open Questions and Pre-Merge TODOs

Learning

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

  • New feature (non-breaking change which adds functionality)

  • Breaking change (fix or feature that would cause existing functionality to change)

Reviewer, please remember our guidelines:

  • Be humble in the language and feedback you give, ask don't tell.
  • Consider using positive language as opposed to neutral when offering feedback. This is to avoid the negative bias that can occur with neutral language appearing negative.
  • Offer suggestions on how to improve code e.g. simplification or expanding clarity.
  • Ensure you give reasons for the changes you are proposing.

Summary by CodeRabbit

  • New Features

    • Support for DataCite schema 4.7 relation metadata (relationTypeInformation) in parameters, indexing, and filtering.
  • Chores

    • Bumped dependencies: bolognese to 2.5.1 and shoryuken to 7.0.1.
  • Tests

    • Added full DataCite 4.7 fixture, VCR cassettes, and extensive 4.7 request specs (includes some duplicated test blocks).

coderabbitai bot commented Feb 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds relationTypeInformation to params sanitizer and DOI index mappings; bumps bolognese and shoryuken gem versions; adds a DataCite 4.7 XML fixture and VCR cassettes; extends/duplicates schema 4.7 request specs; and makes minor formatting edits to development and docker-compose files.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Gemfile updates: Gemfile, Gemfile.lock | Bumps bolognese from ~> 2.5.0 to ~> 2.5.1 and shoryuken from ~> 7.0 to ~> 7.0.1 in two locations (dependency version updates). |
| Params & index mapping: app/lib/params_sanitizer.rb, app/models/doi.rb | Adds relationTypeInformation to relatedIdentifiers and relatedItems mappings in the params sanitizer and the DOI Elasticsearch/index schema (schema/attribute additions only). |
| Dev config formatting: config/environments/development.rb | Removed an empty line between adjacent config statements; no behavioral change. |
| Compose formatting: docker-compose.yml | Removed stray blank lines around the elasticsearch service and EOF; no functional changes. |
| DataCite 4.7 fixture: spec/fixtures/files/datacite-example-full-v4.7.xml | Adds a comprehensive DataCite kernel-4.7 XML fixture covering identifiers, creators, titles, relatedIdentifiers, relatedItems, geo, funding, descriptions, and nested related-item metadata. |
| VCR cassettes: spec/fixtures/vcr_cassettes/.../when_the_request_uses_schema_4_7_-_json/creates_a_Doi.yml, spec/fixtures/vcr_cassettes/.../when_the_request_uses_schema_4_7_-_xml/creates_a_Doi.yml | Adds two VCR cassette YAML files recording PUT interactions to the handle API for schema 4.7 DOI creation (200 OK responses recorded, duplicate interactions present). |
| Request specs (tests): spec/requests/datacite_dois/post_spec.rb, spec/requests/datacite_dois/patch_spec.rb, spec/requests/datacite_dois/datacite_dois_spec.rb | Adds extensive schema 4.7 tests for XML and JSON inputs (deep response and embedded XML assertions), duplicates some test blocks, and introduces filtering tests for relationTypeInformation; also adds a top-level require "pp" in a spec. |
| Fixtures / test artifacts: spec/fixtures/... | Large fixture and cassette additions (+ many lines) supporting schema 4.7 test coverage (static data only). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Metadata 4.7 updates for lupo' accurately reflects the main objective of the PR, which implements changes required for DataCite Metadata Schema Release 4.7 across multiple files and components. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into master |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@spec/requests/datacite_dois/post_spec.rb`:
- Around line 2485-2522: The test payload mixes symbol-style keys (e.g.,
"types":, "relatedItems":, "relatedItemIdentifier":) with string hash-rocket
keys, causing inconsistent key types; update the entries for types,
relatedItems, and relatedItemIdentifier to use the hash-rocket/string-key syntax
(same as the other keys, e.g., "types" => { ... }, "relatedItems" => [ ... ],
"relatedItemIdentifier" => { ... }) so the payload uses consistent string keys
throughout (locate the occurrences of types, relatedItems, and
relatedItemIdentifier in the failing spec and replace the colon-style keys with
=>).
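
The difference between the two key styles can be sketched in plain Ruby. This is a minimal illustration, not the spec's actual payload: the hash contents and the DOI value below are hypothetical and abbreviated. In a hash literal, `"types": value` produces the Symbol key `:types`, while `"types" => value` produces the String key `"types"`.

```ruby
# Hypothetical, abbreviated payload; only the key syntax matters here.
mixed = {
  "doi" => "10.1234/example",                      # hash-rocket: String key
  "types": { "resourceTypeGeneral" => "Poster" },  # colon style: Symbol key (:types)
}

consistent = {
  "doi" => "10.1234/example",
  "types" => { "resourceTypeGeneral" => "Poster" },
}

# Collect the distinct key classes used in each hash.
mixed_key_classes = mixed.keys.map(&:class).uniq
consistent_key_classes = consistent.keys.map(&:class).uniq

puts mixed_key_classes.inspect
puts consistent_key_classes.inspect
puts mixed["types"].inspect  # String lookup misses the Symbol key
```

With mixed keys, a lookup by `"types"` on the raw hash finds nothing, which is why a single key style is easier to reason about in the spec.
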
🧹 Nitpick comments (4)
app/lib/params_sanitizer.rb (1)

168-182: Pre-existing duplicate keys in relatedIdentifiers — consider cleaning up.

relatedMetadataScheme, schemeUri, and schemeType each appear twice in this %i list (lines 174–176 and 179–181). The new relationTypeInformation addition (line 178) is correct, but it was inserted just before these existing duplicates, making them more visible. While duplicate symbols in a permit list are harmless in Rails, removing them would reduce confusion.

♻️ Suggested cleanup
     {
       relatedIdentifiers: %i[
         relatedIdentifier
         relatedIdentifierType
         relationType
         relatedMetadataScheme
         schemeUri
         schemeType
         resourceTypeGeneral
         relationTypeInformation
-        relatedMetadataScheme
-        schemeUri
-        schemeType
       ],
     },
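
One way to spot repeats like these before cleaning up is sketched below in plain Ruby. The list is copied from the comment above as it stands before the suggested cleanup; the variable names are illustrative only and do not come from params_sanitizer.rb.

```ruby
# The %i list as described above, including the repeated entries.
related_identifier_keys = %i[
  relatedIdentifier
  relatedIdentifierType
  relationType
  relatedMetadataScheme
  schemeUri
  schemeType
  resourceTypeGeneral
  relationTypeInformation
  relatedMetadataScheme
  schemeUri
  schemeType
]

# Keys that occur more than once (Enumerable#tally needs Ruby 2.7+).
duplicates = related_identifier_keys.tally.select { |_key, count| count > 1 }.keys

# Same list with repeats dropped, first-occurrence order preserved.
deduplicated = related_identifier_keys.uniq

puts duplicates.inspect
puts deduplicated.size
```
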
app/models/doi.rb (1)

303-308: New relationTypeInformation field mappings look correct.

Using type: :text is appropriate for free-form descriptive content. Both related_identifiers and related_items are consistently updated, aligned with the params sanitizer changes.

Operational note: Adding a new field to the ES mapping requires a reindex (or at minimum a mapping update via PUT /_mapping) on existing indices. Ensure this is part of your deployment plan.
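
As a rough sketch of what such a mapping update body could look like, the Ruby below builds the JSON for a `PUT /<index>/_mapping` request. The nesting under `related_identifiers` and `related_items` is an assumption based on the field names mentioned above, not a copy of the mapping in doi.rb, and no index name or client call is implied.

```ruby
require "json"

# Assumed shape: both parent fields are objects that gain one new text sub-field.
new_field = { relationTypeInformation: { type: "text" } }

mapping_update = {
  properties: {
    related_identifiers: { properties: new_field },
    related_items: { properties: new_field },
  },
}

# Body to send with PUT /<index>/_mapping (index name intentionally left open).
mapping_body = JSON.generate(mapping_update)
puts mapping_body
```

A mapping update like this only affects how new or re-indexed documents are handled; documents already in the index are not re-analyzed, which is why a reindex may still be needed.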

spec/requests/datacite_dois/patch_spec.rb (1)

1001-1038: Test context is outside the describe "PATCH /dois/:id" block and has a duplicate name.

Two issues with this new test block:

  1. Structural placement: The describe "PATCH /dois/:id" block ends at line 781, so this new context (line 1004) sits as a sibling rather than being nested within it. It still performs patch "/dois/#{doi.doi}", so it logically belongs inside that describe block.

  2. Duplicate context name: "when the record exists" (line 1004) duplicates the context name at line 21, which will produce confusing test output. Consider renaming to something like "when the request uses schema 4.7 - xml patch".

♻️ Suggested restructuring

Move the block inside the describe "PATCH /dois/:id" block (before its end at line 781) and rename the context:

-  # Metadata 4.7 - elements
-
-  context "when the record exists" do
+  context "when the request uses schema 4.7 - xml" do
     let(:xml) { Base64.strict_encode64(file_fixture("datacite-example-full-v4.7.xml").read) }

And relocate this block to be inside describe "PATCH /dois/:id".

spec/requests/datacite_dois/post_spec.rb (1)

2619-2619: Commented-out assertion — consider removing or replacing.

This commented-out expectation is presumably stale because the payload now includes multiple relatedIdentifiers. Either remove it or replace with a working assertion (e.g., checking doc.css("relatedIdentifier").map(&:content)).

coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@spec/requests/datacite_dois/post_spec.rb`:
- Line 2619: Uncomment or document the commented-out XML assertion for
relatedIdentifiers in the spec (the line with
expect(doc.at_css("relatedIdentifiers").content).to eq("10.5438/55e5-t5c0")): if
the XML generation is correct, re-enable the assertion to restore coverage; if
it was disabled because relationTypeInformation/RAiD changes break the expected
output, add a TODO comment explaining the reason, link to the issue/PR that
tracks the fix, and include a TODO tag with when to re-enable so tests
explicitly document the known gap instead of silently reducing coverage.
🧹 Nitpick comments (1)
spec/requests/datacite_dois/post_spec.rb (1)

2430-2475: Schema 4.7 XML test context looks good overall.

The test validates key 4.7 elements well: new resourceTypeGeneral values (Poster, Presentation), RAiD identifier type, relationTypeInformation, SWHID identifier type, and relationType "Other".

One concern: the assertions rely on hard-coded array indices (36, 37, 38, 40) into relatedIdentifiers, which are brittle — any change in the fixture file's ordering or number of entries will silently shift indices and cause false failures or pass incorrectly. Consider asserting via find/select on a distinguishing field value, or at minimum adding a comment noting which fixture entry each index corresponds to.
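
A minimal sketch of the lookup-by-field alternative, in plain Ruby. The entries below are made up for illustration and are not taken from the 4.7 fixture; in the spec, the array would come from the parsed response instead.

```ruby
# Hypothetical stand-in for the relatedIdentifiers array in the response.
related_identifiers = [
  { "relatedIdentifier" => "10.1234/cited-work", "relatedIdentifierType" => "DOI",
    "relationType" => "Cites" },
  { "relatedIdentifier" => "https://raid.example/10.1234/project", "relatedIdentifierType" => "RAiD",
    "relationType" => "Other", "relationTypeInformation" => "Project that produced this output" },
  { "relatedIdentifier" => "swh:1:example", "relatedIdentifierType" => "SWHID",
    "relationType" => "IsSupplementedBy" },
]

# Select entries by a distinguishing value instead of a position such as [36].
raid_entry = related_identifiers.find { |ri| ri["relatedIdentifierType"] == "RAiD" }
swhid_entry = related_identifiers.find { |ri| ri["relatedIdentifierType"] == "SWHID" }

puts raid_entry["relationTypeInformation"]
puts swhid_entry["relatedIdentifier"]
```

Because `find` returns nil when nothing matches, a reordered or shortened fixture surfaces as an explicit missing entry rather than an assertion against the wrong element.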

coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@spec/requests/datacite_dois/datacite_dois_spec.rb`:
- Line 2249: Update the RSpec example description string in the it block named
"finds the dois based on relatedIdentiers.relationTypeInformation" to correct
the typo: change "relatedIdentiers" to "relatedIdentifiers" so the description
reads "finds the dois based on relatedIdentifiers.relationTypeInformation";
locate the it block in spec/requests/datacite_dois/datacite_dois_spec.rb and
edit the string accordingly.
- Around line 2184-2186: Remove the extra blank line at the start of the block
body for the describe "GET /dois/query=...relationTypeInformation" example and
delete the trailing whitespace after the let!(:datacite_doi) line; specifically,
edit the block containing describe "GET /dois/query=...relationTypeInformation"
and the let!(:datacite_doi) { create(:doi, client: client, aasm_state:
"findable", ... } declaration to ensure no empty line immediately after the
block header and no trailing spaces at the end of the let! line.
- Line 2: Remove the debug require by deleting the line `require "pp"` from
spec/requests/datacite_dois/datacite_dois_spec.rb and ensure there is a blank
line after the frozen string literal comment (if present) to satisfy RuboCop's
Layout/EmptyLineAfterMagicComment; no other changes are needed since `pp` is in
the stdlib.
🧹 Nitpick comments (2)
spec/requests/datacite_dois/datacite_dois_spec.rb (2)

2241-2255: Tests only cover the happy path with a single matching DOI — consider adding a non-matching record.

Both tests create only one DOI that contains relationTypeInformation. The total == 1 assertion doesn't prove the query actually filters on the new field — it would pass even if the query matched everything. Adding a second DOI without relationTypeInformation would make the assertions meaningful by confirming the unmatched record is excluded.
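
The reasoning can be sketched without Rails or Elasticsearch. In the plain-Ruby illustration below, `FakeDoi`, `match_all_search`, and `field_search` are hypothetical stand-ins, not lupo code: a search that ignores the query still satisfies a total == 1 check when only one record exists, and is only exposed once a non-matching record is added.

```ruby
# Hypothetical stand-in for an indexed DOI record.
FakeDoi = Struct.new(:doi, :relation_type_information)

# Failure mode to guard against: the query is ignored and everything matches.
def match_all_search(records, _term)
  records
end

# Intended behavior: only records whose field contains the term are returned.
def field_search(records, term)
  records.select { |record| record.relation_type_information.to_s.include?(term) }
end

matching = FakeDoi.new("10.1234/with-info", "Describes the funding relationship")
non_matching = FakeDoi.new("10.1234/without-info", nil)
term = "funding"

# With a single record, the broken search still yields a total of 1.
passes_with_one_record = match_all_search([matching], term).size == 1

# With a second, non-matching record, the broken search yields 2 and is caught.
passes_with_two_records = match_all_search([matching, non_matching], term).size == 1

# The real filter returns only the matching record in both setups.
filtered_dois = field_search([matching, non_matching], term).map(&:doi)

puts passes_with_one_record
puts passes_with_two_records
puts filtered_dois.inspect
```
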


2247-2249: Remove extra blank line.

There's a superfluous blank line between the two it blocks (line 2248). Minor nit for consistency with the rest of the file.

svogt0511 requested a review from jrhoads February 16, 2026 17:49

jrhoads (Contributor) left a comment

Overall this looks good. CodeRabbit picked up a few minor issues: a misspelling in the spec description and a leftover require "pp".
