dejanb (Contributor) commented Dec 18, 2025

Proposes onDuplicate parameter with three modes (ingest, ignore, replace) to handle SBOMs with duplicate document identifiers.

Assisted-by: Claude

Summary by Sourcery

Documentation:

  • Add an ADR describing configurable SBOM duplicate handling modes (ingest, ignore, replace) and how they are applied via API uploads and importer configuration.

Proposes onDuplicate parameter with three modes (ingest, ignore, replace)
to handle SBOMs with duplicate document identifiers.

Assisted-by: Claude
Signed-off-by: Dejan Bosanac <dbosanac@redhat.com>
sourcery-ai bot (Contributor) commented Dec 18, 2025

Reviewer's Guide

Adds a new ADR describing a configurable onDuplicate parameter with three handling modes (ingest, ignore, replace) for SBOMs with duplicate document identifiers, including API and importer configuration and high-level implementation scope.

Sequence diagram for SBOM upload with configurable onDuplicate handling

sequenceDiagram
    actor Client
    participant ApiV2 as ApiV2SbomEndpoint
    participant Ingestor as IngestorService
    participant Graph as GraphLayer
    participant DB as Database
    participant Store as SbomStorage

    Client->>ApiV2: POST /api/v2/sbom (sbom, onDuplicate)
    ApiV2->>Ingestor: ingest(sbomContent, onDuplicate, source, metadata)

    Ingestor->>Graph: get_sbom_by_document_id(documentId)
    Graph-->>Ingestor: existingSbom or null

    alt onDuplicate = ingest or existingSbom is null
        Ingestor->>Store: saveSbomBlob(sbomContent)
        Store-->>Ingestor: storageLocation
        Ingestor->>DB: insertSbomRecord(documentId, hash, storageLocation)
        DB-->>Ingestor: sbomRecord
        Ingestor-->>ApiV2: NewSbomInfo
        ApiV2-->>Client: 201 Created
    else onDuplicate = ignore and existingSbom not null
        Ingestor-->>ApiV2: ExistingSbomInfo
        ApiV2-->>Client: 200 OK (duplicate ignored)
    else onDuplicate = replace and existingSbom not null
        Ingestor->>Store: deleteSbomBlob(existingSbom.storageLocation)
        Ingestor->>DB: deleteSbomRecord(existingSbom.id)
        DB-->>Ingestor: deleted
        Ingestor->>Store: saveSbomBlob(sbomContent)
        Store-->>Ingestor: storageLocation
        Ingestor->>DB: insertSbomRecord(documentId, hash, storageLocation)
        DB-->>Ingestor: sbomRecord
        Ingestor-->>ApiV2: NewSbomInfo
        ApiV2-->>Client: 200 OK (replaced)
    end
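
The branches in the sequence diagram above amount to a small dispatch on the requested mode and on whether a record with the same document identifier already exists. The Python sketch below is illustrative only: `InMemoryIngestor`, its in-memory `records` dict, and the returned status codes are assumptions modeled on the diagram, not the project's implementation, and the existing hash-based deduplication is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

MODES = ("ingest", "ignore", "replace")


@dataclass
class SbomRecord:
    id: int
    document_id: str
    digest: str
    content: bytes


class InMemoryIngestor:
    """Hypothetical stand-in for the ingestor service, graph layer, database and storage."""

    def __init__(self) -> None:
        self.records: Dict[int, SbomRecord] = {}
        self._next_id = 1

    def _find(self, document_id: str) -> Optional[SbomRecord]:
        # Plays the role of get_sbom_by_document_id in the diagram.
        for record in self.records.values():
            if record.document_id == document_id:
                return record
        return None

    def _insert(self, document_id: str, content: bytes) -> SbomRecord:
        digest = hashlib.sha256(content).hexdigest()
        record = SbomRecord(self._next_id, document_id, digest, content)
        self.records[record.id] = record
        self._next_id += 1
        return record

    def ingest(
        self, document_id: str, content: bytes, on_duplicate: str = "ingest"
    ) -> Tuple[int, SbomRecord]:
        """Return (HTTP-like status, record) following the three branches of the diagram."""
        if on_duplicate not in MODES:
            raise ValueError(f"unknown onDuplicate mode: {on_duplicate!r}")
        existing = self._find(document_id)
        if existing is None or on_duplicate == "ingest":
            return 201, self._insert(document_id, content)
        if on_duplicate == "ignore":
            return 200, existing  # duplicate ignored, nothing is written
        # replace: drop the old record, then store the new content
        del self.records[existing.id]
        return 200, self._insert(document_id, content)
```

In this sketch the delete and the insert in the replace branch are two separate steps; a real implementation would probably need to make them atomic so that a failed insert does not lose the previous SBOM.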

Sequence diagram for importer using onDuplicate behavior

sequenceDiagram
    participant Scheduler
    participant Importer as SbomImporter
    participant Remote as SbomSource
    participant Ingestor as IngestorService
    participant Graph as GraphLayer
    participant DB as Database
    participant Store as SbomStorage

    Scheduler->>Importer: triggerImport()
    Importer->>Remote: fetchSbomList()
    Remote-->>Importer: sbomReferences

    loop for each sbomReference
        Importer->>Remote: fetchSbom(sbomReference)
        Remote-->>Importer: sbomContent
        Importer->>Ingestor: ingest(sbomContent, config.onDuplicate, importerName, metadata)

        Ingestor->>Graph: get_sbom_by_document_id(documentId)
        Graph-->>Ingestor: existingSbom or null

        alt config.onDuplicate = ignore and existingSbom not null
            Ingestor-->>Importer: ExistingSbomInfo (skipped)
        else config.onDuplicate = replace and existingSbom not null
            Ingestor->>Store: deleteSbomBlob(existingSbom.storageLocation)
            Ingestor->>DB: deleteSbomRecord(existingSbom.id)
            DB-->>Ingestor: deleted
            Ingestor->>Store: saveSbomBlob(sbomContent)
            Store-->>Ingestor: storageLocation
            Ingestor->>DB: insertSbomRecord(documentId, hash, storageLocation)
            DB-->>Ingestor: sbomRecord
            Ingestor-->>Importer: NewSbomInfo (replaced)
        else other cases
            Ingestor->>Store: saveSbomBlob(sbomContent)
            Store-->>Ingestor: storageLocation
            Ingestor->>DB: insertSbomRecord(documentId, hash, storageLocation)
            DB-->>Ingestor: sbomRecord
            Ingestor-->>Importer: NewSbomInfo (ingested)
        end
    end
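
For the importer, the same decision is applied once per fetched SBOM, with the mode taken from the importer's configuration rather than from a request. The sketch below is a rough Python illustration under stated assumptions: `run_import`, the `fetch` callable, and the dict-based `store` are hypothetical names that stand in for the importer, the remote source, and the database plus blob storage.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple


def run_import(
    references: Iterable[str],
    fetch: Callable[[str], Tuple[str, bytes]],
    store: Dict[str, List[bytes]],
    on_duplicate: str = "ingest",
) -> Counter:
    """Apply one importer-level on_duplicate setting to every fetched SBOM.

    `fetch` returns (document identifier, SBOM content) for a reference.
    `store` maps a document identifier to the stored copies of that document.
    """
    if on_duplicate not in ("ingest", "ignore", "replace"):
        raise ValueError(f"unknown onDuplicate mode: {on_duplicate!r}")

    outcomes: Counter = Counter()
    for ref in references:
        document_id, content = fetch(ref)
        copies = store.setdefault(document_id, [])
        if copies and on_duplicate == "ignore":
            outcomes["skipped"] += 1      # keep what is already stored
        elif copies and on_duplicate == "replace":
            store[document_id] = [content]  # drop old copies, keep the new one
            outcomes["replaced"] += 1
        else:
            copies.append(content)        # new document, or mode is "ingest"
            outcomes["ingested"] += 1
    return outcomes
```

Returning a tally of outcomes is a design choice of this sketch; it makes it easy to log how many SBOMs a run skipped, replaced, or ingested.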

Class diagram for configurable onDuplicate handling

classDiagram
    class ApiV2SbomEndpoint {
        +uploadSbom(requestBody, onDuplicateQuery, sourceHeader)
        +parseOnDuplicate(onDuplicateQuery) OnDuplicateMode
    }

    class ImporterConfigEndpoint {
        +createSbomImporter(name, source, onDuplicate, period)
        +updateSbomImporter(name, source, onDuplicate, period)
    }

    class SbomImporter {
        +string name
        +string source
        +OnDuplicateMode onDuplicate
        +string period
        +run()
        +fetchSbomReferences()
        +fetchSbom(reference)
    }

    class IngestorService {
        +ingest(sbomContent, onDuplicate, source, metadata) SbomInfo
        -handleIngestMode(sbomContent, documentId, source, metadata) SbomInfo
        -handleIgnoreMode(existingSbom) SbomInfo
        -handleReplaceMode(existingSbom, sbomContent, documentId, source, metadata) SbomInfo
    }

    class GraphLayer {
        +get_sbom_by_document_id(documentId) SbomRecord
    }

    class SbomRecord {
        +string id
        +string documentId
        +string hash
        +string storageLocation
        +string format
    }

    class SbomInfo {
        +string id
        +string documentId
        +string status
        +string modeApplied
    }

    class OnDuplicateMode {
        <<enumeration>>
        ingest
        ignore
        replace
    }

    class Database {
        +insertSbomRecord(documentId, hash, storageLocation, format) SbomRecord
        +deleteSbomRecord(id)
    }

    class SbomStorage {
        +saveSbomBlob(sbomContent) string
        +deleteSbomBlob(storageLocation)
    }

    ApiV2SbomEndpoint --> IngestorService : uses
    ImporterConfigEndpoint --> SbomImporter : configures
    SbomImporter --> IngestorService : calls ingest
    IngestorService --> GraphLayer : queries
    IngestorService --> Database : writes
    IngestorService --> SbomStorage : manages blobs
    GraphLayer --> SbomRecord : returns
    Database --> SbomRecord : persists
    IngestorService --> SbomInfo : returns
    SbomImporter --> OnDuplicateMode
    ApiV2SbomEndpoint --> OnDuplicateMode
    IngestorService --> OnDuplicateMode
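
The `OnDuplicateMode` enumeration and the `parseOnDuplicate` helper from the class diagram could look roughly like this in Python. Two details are assumptions of this sketch rather than statements from the ADR: that an absent parameter falls back to `ingest`, and that values are matched case-insensitively.

```python
from enum import Enum
from typing import Optional


class OnDuplicateMode(Enum):
    INGEST = "ingest"
    IGNORE = "ignore"
    REPLACE = "replace"


# Assumption: the default preserves current behavior; the ADR defines the real default.
DEFAULT_MODE = OnDuplicateMode.INGEST


def parse_on_duplicate(raw: Optional[str]) -> OnDuplicateMode:
    """Turn a query-parameter or config value into a mode, rejecting unknown values."""
    if raw is None or raw.strip() == "":
        return DEFAULT_MODE
    try:
        return OnDuplicateMode(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(mode.value for mode in OnDuplicateMode)
        raise ValueError(
            f"invalid onDuplicate value {raw!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown values with an explicit error, instead of silently using the default, keeps a typo such as `onDuplicate=replce` from quietly ingesting a duplicate.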

File-Level Changes

Change: Document ADR for configurable SBOM duplicate handling behavior using an onDuplicate parameter with ingest/ignore/replace modes.
Details:
  • Describe current hash-based deduplication and limitations with stable SBOM document identifiers (SPDX documentNamespace, CycloneDX serialNumber).
  • Define three duplicate handling modes (ingest, ignore, replace) and their intended semantics and default behavior.
  • Specify configuration options for onDuplicate at API upload (query parameter) and importer configuration (per-importer field).
  • Outline how duplicate detection works via document identifier lookup and the high-level responsibilities of core components (ingestor service, graph layer, API endpoints, importer config).
  • Capture benefits, logging/atomicity considerations, and an open question about preserving user-added labels on replace.
Files: docs/adrs/00013-configurable-sbom-duplicate-handling.md
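
Duplicate detection by document identifier depends on reading the right field from each format. As a minimal sketch, assuming JSON-encoded SBOMs in SPDX 2.x (top-level `documentNamespace`) or CycloneDX (top-level `serialNumber`), the lookup key could be extracted as follows. The function name `document_identifier` is hypothetical, and other encodings such as XML or tag-value are not covered.

```python
import json
from typing import Optional


def document_identifier(sbom_json: str) -> Optional[str]:
    """Return the stable document identifier, or None if the SBOM has no usable one."""
    doc = json.loads(sbom_json)
    if not isinstance(doc, dict):
        return None
    if doc.get("bomFormat") == "CycloneDX":
        raw = doc.get("serialNumber")
    elif "spdxVersion" in doc:
        raw = doc.get("documentNamespace")
    else:
        raw = None  # unrecognized format
    if isinstance(raw, str) and raw.strip():
        return raw.strip()
    return None
```

Two uploads whose content differs but whose identifier is the same would get different hashes and the same key here, which is the case hash-based deduplication alone cannot catch.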


sourcery-ai bot (Contributor) left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The ADR filename is 00013-... but the heading uses # 00011. ...; consider aligning the ADR number in the title with the filename/sequence for consistency.
  • It would be useful to clarify how onDuplicate should behave when an SBOM is missing a documentNamespace/serialNumber or has an invalid/blank identifier (e.g., fallback to hash-based behavior or treat as ingest).
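
One of the two options raised in the second point, treating an SBOM without a usable identifier as `ingest`, can be sketched in a few lines of Python. This shows only that option and is not a decision recorded in the ADR; `effective_mode` is a hypothetical helper name.

```python
from typing import Optional

MODES = ("ingest", "ignore", "replace")


def effective_mode(requested: str, document_id: Optional[str]) -> str:
    """Downgrade to "ingest" when there is no identifier to detect duplicates with."""
    if requested not in MODES:
        raise ValueError(f"unknown onDuplicate mode: {requested!r}")
    if document_id is None or not document_id.strip():
        # Missing or blank identifier: ignore/replace have nothing to match against.
        return "ingest"
    return requested
```

With this choice, `ignore` and `replace` never act on documents that cannot be matched, and such uploads are still subject to whatever hash-based deduplication already exists.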


@@ -0,0 +1,118 @@
+# 00011. Configurable SBOM Duplicate Handling

issue (typo): ADR number in the title does not match the filename and may be confusing.

The file is named 00013-... but the ADR title starts with 00011. Please update the heading number to match the filename, or add a brief note if the mismatch is intentional, to avoid confusion when referencing this ADR.

Suggested change
-# 00011. Configurable SBOM Duplicate Handling
+# 00013. Configurable SBOM Duplicate Handling
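
A mismatch like this one can be caught mechanically. The following Python sketch compares the leading number in an ADR filename with the number in its first heading; the naming pattern (`NNNNN-title.md` and `# NNNNN. Title`) is inferred from this pull request, and `adr_number_mismatch` is a hypothetical helper, not an existing project script.

```python
import re
from typing import Optional, Tuple


def adr_number_mismatch(filename: str, heading: str) -> Optional[Tuple[str, str]]:
    """Return (filename number, heading number) when they differ, else None."""
    in_name = re.match(r"(\d+)-", filename)
    in_heading = re.match(r"#\s*(\d+)\.", heading)
    if in_name is None or in_heading is None:
        return None  # nothing to compare
    if in_name.group(1) == in_heading.group(1):
        return None
    return in_name.group(1), in_heading.group(1)
```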

Comment on lines +48 to +50
+Add optional `onDuplicate` query parameter to SBOM upload endpoint:
+
+```bash

nitpick (typo): Minor grammar tweak: add "it" to read more naturally.

Suggest rephrasing this line to # Ignore duplicates - skip if it already exists for smoother readability.

Suggested change
-Add optional `onDuplicate` query parameter to SBOM upload endpoint:
-```bash
+Add optional `onDuplicate` query parameter to SBOM upload endpoint:
+```bash
+# Ignore duplicates - skip if it already exists
+cat sbom.json | http POST localhost:8080/api/v2/sbom onDuplicate=ignore
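
Because the ADR describes `onDuplicate` as a query parameter, a client has to put it in the URL rather than in the request body. The Python sketch below builds such a request with the standard library; the host and the `/api/v2/sbom` path come from the command above, while the helper name `build_upload_request` and the `Content-Type` header value are assumptions of this example.

```python
import urllib.parse
import urllib.request


def build_upload_request(
    base_url: str, sbom_bytes: bytes, on_duplicate: str = "ingest"
) -> urllib.request.Request:
    """Build a POST request that carries onDuplicate in the query string."""
    query = urllib.parse.urlencode({"onDuplicate": on_duplicate})
    url = f"{base_url.rstrip('/')}/api/v2/sbom?{query}"
    return urllib.request.Request(
        url,
        data=sbom_bytes,  # the SBOM document is the request body
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed content type
    )


req = build_upload_request("http://localhost:8080", b'{"bomFormat": "CycloneDX"}', "ignore")
print(req.full_url)  # http://localhost:8080/api/v2/sbom?onDuplicate=ignore
```

Passing `req` to `urllib.request.urlopen` would send it to a running server. In HTTPie, URL query parameters are written as `name==value`, whereas `name=value` denotes a request body field.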

codecov bot commented Dec 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.24%. Comparing base (648d488) to head (b671057).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2188   +/-   ##
=======================================
  Coverage   68.24%   68.24%           
=======================================
  Files         376      376           
  Lines       21208    21208           
  Branches    21208    21208           
=======================================
  Hits        14473    14473           
+ Misses       5868     5864    -4     
- Partials      867      871    +4     


ruromero (Contributor) left a comment

I agree with the suggested approach. I could suggest adding versioning to the existing solutions in case you want to add more complex capabilities, but the suggested ones add enough flexibility.
