-
Notifications
You must be signed in to change notification settings - Fork 7
Rework sequence querying & updating #7169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ed queries for no-DataId data classes and no-RowId sample types. Clear deleted samples from assay wells.
…query source DB for object IDs, etc., use pluralize() when logging counts.
…xp.Data & exp.Object rows.
…calls getContainerClause() calls getContainerFieldKey()
labkey-jeckels
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See concern about DataId vs LSID
| super(DbSchema.get(AbstractTsvAssayProvider.ASSAY_SCHEMA_NAME, DbSchemaType.Provisioned)); | ||
| } | ||
|
|
||
| // Provisioned assay result tables occasionally have no DataId column; hopefully they have an LSID column. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they have an LSID, I think it would be for the assay result row and not an exp.data row. Do we know anything about these domains without a DataId? Are they very old?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@labkey-jeckels it's a single assayresult provisioned table named c5d21781_assayplatereplicatestats (all other provisioned tables in the schema end with _data_fields). The table is empty, making it impossible to verify via data. It doesn't look old... the domain ("AssayPlateReplicateStats", unsurprisingly) and properties (18 of them) were all created 7/29/2025.
I implemented this guess as a precaution, and you likely have a better guess than I. I could just log a warning about this case and skip it. Let's discuss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see that this is associated with PlateReplicateStatsDomainKind. I'll take a look at that class to understand how it's hooked up to everything else.
Rationale
The previous sequences metadata query was flawed: it joined on sequence name (which is not globally unique) without qualifying by schema name. That produced confusing row duplication in cases where multiple serial columns existed on a single table (e.g., etltest.Source). Switch dialect method to return a collection of sequences since a given table could have more than one.
Add copying of assay results.
Correct Sample Type's delete to pass a list of ObjectIds (it was passing sample IDs!). Clear deleted samples from assay wells.
Fix our handling of data classes that lack a DataId column and sample types that lack a RowId.
Attempt using a small batch size for core.Documents to speed up copy.
ExperimentDeleteService allows assay results handler to delete from exp tables.
Refactor the filtering method hierarchy and get rid of
DUMMY_FIELD_KEY. Previously, getTableSelector() would call getContainerFieldKey() and pass the result through getTableFilterClause() and getContainerClause(). This was awkward and forced the use of DUMMY_FIELD_KEY to ensure getContainerClause() overrides were called. A much more sensible approach: getTableSelector() now calls getTableFilterClause() which calls getContainerClause() which calls getContainerFieldKey(). Handlers now override at whatever level is appropriate and the dumb marker object is no longer needed.Delete not copied MaterialIds from inventory.Item.