-
Notifications
You must be signed in to change notification settings - Fork 14
Multi-analysis specification with TreatmentPatternsModule #273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi-analysis specification with TreatmentPatternsModule #273
Conversation
|
The script now checks that the results folder exists, reads completed analysis IDs from any passed_pathway_runs.csv files, and then runs only the treatment patterns that have not yet been executed. The work folder is updated with two files: passed_pathway_runs.csv and failed_pathway_runs.csv. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #273 +/- ##
===========================================
+ Coverage 97.82% 99.03% +1.20%
===========================================
Files 16 16
Lines 4193 4748 +555
===========================================
+ Hits 4102 4702 +600
+ Misses 91 46 -45 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Update TreatmentPatternsModule.Rd
* fixed data types and added cdm_version_concept_id * missed col value * target_id as int * changed yes/no
Removed duplicate table in treatmentPatternsRdms.csv
2357413 to
33e2f41
Compare
ccf6270 to
c520b3b
Compare
c520b3b to
09481c8
Compare
anthonysena
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the createAnalysisSpecification function - I think all that is needed here is the createModuleSpecification changes (that appear to be in createAnalysisSpecification) and the multi analysis function. Reach out with any questions or concerns.
anthonysena
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SivamshIndukuri a few additional comments based on a deeper review. Let me know if you have any questions or concerns. Thanks!
R/Module-TreatmentPatterns.R
Outdated
| resultAppend <- FALSE | ||
| passedPathways <- integer(0) | ||
|
|
||
| if (file.exists(file.path(workFolder, "passed_pathway_runs.csv")) && dir.exists(resultsFolder)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more addition to the logic here: we should consider the value of the incremental flag of the executionSettings: https://ohdsi.github.io/Strategus/reference/createCdmExecutionSettings.html#arg-incremental. If executionSettings$incremental == FALSE, we should remove any info on prior runs (passed and failed) and re-run from scratch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding executionSettings$incremental to the 'if condition' should resolve this. It will act like 0 pathways passed and resultAppend will be false.
R/Module-TreatmentPatterns.R
Outdated
| targets <- cohorts[cohorts$type == "target", c("cohortName", "cohortId"), drop = FALSE] | ||
| colnames(targets) <- c("target_cohort_name", "target_cohort_id") | ||
|
|
||
| events <- cohorts[cohorts$type == "event", c("cohortName", "cohortId"), drop = FALSE] | ||
| colnames(events) <- c("event_cohort_name", "event_cohort_id") | ||
|
|
||
| cohortAnalysisTable <- merge(targets, events, by = NULL) %>% dplyr::mutate("analysis_id" = analysisId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cohortAnalysisTable provides the combination of target cohorts, event cohorts, exit cohorts and the current analysisId. cohorts already contains the target/event/exit cohorts, so would it be easiest to construct cohortAnalysisTable by appending the current analysisId? Also, we aim to use camelCase in R code per OHDSI code guidelines here: https://ohdsi.github.io/Hades/codeStyle.html#Interfacing_between_R_and_SQL. If you need to move between snake & camelCase, DatabaseConnector has some utility functions described in that link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would be the easiest. I originally went with this to make it easier when making the queries for OhdsiReportGenerator but it should still work in this format.
This PR updates the TreatmentPatterns Module to enable multiple analyses to be run simultaneously.
Summary of Changes
Added createModuleSpecification:
This function builds a TreatmentPatterns analysis spec from lists of target, event, and exit cohort definitions, converting them into the cohort data frame the package expects, and returns an analysis settings list.
Added createMultiAnalysisModuleSpecification:
This function takes a list of analysis specifications made by createAnalysisSpecification
Modified createModuleSpecification:
I modified the return specification to be wrapped by a list so it matches the createMultiAnalysisModuleSpecification returned results. Now it returns a list in the form of list(tpAnalysisList = list())
Modified execute:
I modified this function so it can loop through the tpAnalysisList and run each treatment pattern. All the logs of passed and failed pathways are written to the treatment_pathway_runs.csv in the work folder. The execute method also makes a new table in the results folder called analysis_cohorts, which has the columns:
event_cohort_name, event_cohort_id, target_cohort_id, target_event_id, and analysis_id
Modified data model:
Modified the model to allow the analysis_cohorts table to be inserted
eg.
tp <- TreatmentPatternsModule$new()
modSpecs <- list(
tp$createAnalysisSpecification(...),
tp$createAnalysisSpecification(...),
)
tpModuleSpecifications <- tp$createMultiAnalysisModuleSpecification(tpAnalysis = modSpecs)