Rework attachment handling during database migrations #7191
…ernal data source. Adjust the attachment copying code to use this new mechanism.
…pecimen requests.
labkey-jeckels
left a comment
There are a fair number of edits in code paths beyond migration. Hopefully automated tests will give good coverage without needing separate manual testing?
experiment/src/org/labkey/experiment/ExperimentMigrationSchemaHandler.java
The changes (outside migration) are primarily adjusting AttachmentTypes, right? These aren't really used, other than one code path (that I've tested multiple times during this development). I doubt there's any automated testing of the admin page that uses them. My leveraging AttachmentType here means they've gotten more use than ever and I've fixed several issues that no one noticed.
Rationale
Repackage API classes into org.labkey.api.migration.

The Documents table lives in core but is effectively owned and managed by many different modules. The previous approach to migrating this table attempted to copy all attachments at once, which accommodated container filtering and (barely) the global attachment approach of Labbook, but not domain row and other complex filtering. As a result, we copied far too many rows that shouldn't have ended up in the cloned database and wasted hours of copying time.

This new approach makes each schema handler responsible for copying the documents it manages, immediately after its tables have been populated. The schema handlers can usually delegate all the work to the existing AttachmentType implementations, which already know how to retrieve parent EntityIds for each type. I've adjusted AttachmentType slightly to make it more flexible (supporting the original use case where we selected EntityIds from the documents table and the new use case where we select those EntityIds from the populated tables instead).

Also, TempTableInClauseGenerator was hard-coded to create temp tables in the LabKey data source temp schema, which of course fails when using a large IN clause against an external data source. Teach TempTableInClauseGenerator how to use a different temp schema and introduce a SqlDialect.appendInClauseSql() variant that takes a customized InClauseGenerator for this purpose. The attachment copying code collects potentially large sets of EntityIds and RowIds from the target tables, then uses this mechanism to select the appropriate rows from the source database's core.Documents table.
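For readers unfamiliar with the temp-table IN clause technique, here is a minimal, self-contained sketch of the general idea. It is not LabKey's actual code: `InClauseSketch`, `Plan`, `plan()`, the `inlineLimit` parameter, and the table names are all hypothetical and exist only for illustration. The point it tries to show is that the caller supplies the schema-qualified temp table, so the table can live in the same data source as the query that references it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names are LabKey's real API.
class InClauseSketch
{
    /** Setup statements to run first, plus the fragment to append to a WHERE clause. */
    static final class Plan
    {
        final List<String> setup;
        final String fragment;

        Plan(List<String> setup, String fragment)
        {
            this.setup = setup;
            this.fragment = fragment;
        }
    }

    /**
     * Builds an inline IN list for small id sets and a temp-table subselect for large ones.
     * tempTable is schema-qualified and must be reachable from the data source being queried.
     */
    static Plan plan(String column, Collection<Integer> ids, int inlineLimit, String tempTable)
    {
        if (ids.isEmpty())
            return new Plan(List.of(), "1 = 0"); // an empty IN () list is not valid SQL; match nothing

        if (ids.size() <= inlineLimit)
        {
            String list = ids.stream().map(String::valueOf).collect(Collectors.joining(", "));
            return new Plan(List.of(), column + " IN (" + list + ")");
        }

        List<String> setup = new ArrayList<>();
        setup.add("CREATE TABLE " + tempTable + " (Id INT NOT NULL)");
        // Intended to be executed as a JDBC batch, binding one id per row
        setup.add("INSERT INTO " + tempTable + " (Id) VALUES (?)");
        return new Plan(setup, column + " IN (SELECT Id FROM " + tempTable + ")");
    }

    public static void main(String[] args)
    {
        Plan small = plan("d.Parent", List.of(1, 2, 3), 10, "ext_temp.ids_1");
        Plan large = plan("d.Parent", List.of(1, 2, 3), 2, "ext_temp.ids_1");
        System.out.println(small.fragment);
        large.setup.forEach(System.out::println);
        System.out.println(large.fragment);
    }
}
```

In this sketch the inline form needs no setup, while the temp-table form returns the statements to run before the main query. A real implementation would also need to drop or expire the temp table and handle non-integer keys such as EntityIds.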