Discussion: Consider modifications to the BrAPI genotyping specification to support additional information related to Variants #639

@kadreher

During the 2025 BrAPI hackathon, we initiated a discussion regarding additional information that could potentially be useful to transfer via BrAPI for Variants and welcome feedback from the community.

Various members of the community have ways to distinguish between an underlying polymorphism in the genomes of a species and the methods used to detect that polymorphism. The current BrAPI specification for Variant may mix elements of these two levels. However, a single polymorphism can often be detected in more than one way, so the relationship between a polymorphism and its detection methods may be one-to-many.

The EBS currently uses "marker" to describe the genotypic variation and "assay" to describe the assay used to detect the polymorphism.

Breedbase uses "protocol" to help distinguish the different means of detecting the same genomic difference.

Other systems may use other terms like "instance."

At the level of assay/protocol, some relevant characteristics would be:
assay/protocol name
assay/protocol ID
technology
technology provider
conditions (e.g. enzymes or primers used)
strand of the genome used in reporting the result

In the case of the assay, each polymorphism would be associated with a different assay.
In the case of a protocol, I am not sure whether the same protocol (e.g. enzymes used) could be associated with many markers.

In one data model, it seems that the reference genome used to "call" the position of the polymorphism could be part of the protocol specification.
In another data model, the "marker" can be associated in a one-to-many relationship with a map position, so the position would not be an inherent feature of either the marker or the assay/protocol level.

One potential option for handling the more granular information would be to create a new level, e.g. assay, and give it a markerDbID as one of its attributes.
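
As a rough illustration of this first option, the sketch below models an assay as its own record that points back to a marker. The class name, the `assayDbId` and `assayName` fields, and the placeholder IDs are all hypothetical; the remaining fields simply mirror the characteristics listed above. None of this is part of the current BrAPI specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assay:
    """Hypothetical assay-level record; not an existing BrAPI object."""
    assayDbId: str                     # assumed identifier name
    assayName: str
    markerDbID: str                    # link back to the underlying polymorphism
    technology: Optional[str] = None
    technologyProvider: Optional[str] = None
    conditions: Optional[str] = None   # e.g. enzymes or primers used
    strand: Optional[str] = None       # strand used when reporting the result


def assays_for_marker(assays, marker_db_id):
    """Return every assay that detects the given marker (one-to-many)."""
    return [a for a in assays if a.markerDbID == marker_db_id]


# Placeholder records, invented purely for illustration.
assays = [
    Assay("assay-1", "first assay", "marker-001", "technology A", "provider X"),
    Assay("assay-2", "second assay", "marker-001", "technology B", "provider Y"),
    Assay("assay-3", "third assay", "marker-002", "technology A", "provider X"),
]

print([a.assayDbId for a in assays_for_marker(assays, "marker-001")])
```

With this shape, the marker stays free of technology-specific detail and each detection method carries its own metadata.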

Another potential option would be to add the new fields (e.g. technology) to the variant level and then add a text value like "markerName" so that multiple variants could share the same markerName.
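
A minimal sketch of this second option, assuming variant records are plain dictionaries: `technology`, `technologyProvider`, and `markerName` are the proposed additions rather than existing BrAPI fields, and the record values are placeholders. The shared `markerName` is the only thing tying the variants together, so grouping is a simple text match.

```python
from collections import defaultdict

# Hypothetical variant records carrying the proposed extra fields.
variants = [
    {"variantDbId": "v1", "technology": "technology A",
     "technologyProvider": "provider X", "markerName": "marker-001"},
    {"variantDbId": "v2", "technology": "technology B",
     "technologyProvider": "provider Y", "markerName": "marker-001"},
    {"variantDbId": "v3", "technology": "technology A",
     "technologyProvider": "provider X", "markerName": "marker-002"},
]


def group_by_marker_name(records):
    """Map each markerName to the variantDbIds that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["markerName"]].append(record["variantDbId"])
    return dict(groups)


print(group_by_marker_name(variants))
```

One trade-off worth noting is that a free-text markerName depends on every data provider spelling it identically, whereas an ID-based link would not.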

A practical example of this using the Marker and Assay terminology would be:

Marker = S1_0202020_Ref_v4.2

Assay 1 = M00002 - with technology A from technology provider X
Assay 2 = P888 - with technology B from technology provider Y

The data returned from X would have M00002 and the data returned from Y would have P888.

At some point, people may want to combine those data and report the results for S1_0202020_Ref_v4.2.
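
The combination step in this example could look roughly like the sketch below. The marker and assay names come from the example above; the sample names and genotype calls are made up for illustration, and the function name and tuple layout are assumptions rather than anything defined by BrAPI.

```python
# Assay-to-marker lookup taken from the example above.
ASSAY_TO_MARKER = {
    "M00002": "S1_0202020_Ref_v4.2",  # technology A, technology provider X
    "P888": "S1_0202020_Ref_v4.2",    # technology B, technology provider Y
}


def combine_by_marker(calls):
    """Group (assay, sample, genotype) calls under the shared marker name."""
    combined = {}
    for assay_name, sample, genotype in calls:
        marker = ASSAY_TO_MARKER[assay_name]
        combined.setdefault((marker, sample), []).append((assay_name, genotype))
    return combined


# Made-up calls, as they might come back from providers X and Y.
calls = [
    ("M00002", "sample-1", "A/A"),
    ("P888", "sample-1", "A/A"),
    ("P888", "sample-2", "A/G"),
]

combined = combine_by_marker(calls)
for (marker, sample), results in sorted(combined.items()):
    print(marker, sample, results)
```

Keeping the assay name next to each call preserves provenance, so disagreements between technologies for the same marker and sample remain visible after merging.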

These levels of information may also have implications for storing data for QTLs/haplotypes/marker groups, which will be the subject of another discussion issue.

We would be grateful for feedback from the community on the idea of expanding the BrAPI specification to include this type of information.