Final CRAN prep #12
Open
rafdoodle wants to merge 40 commits into main from dev
A guide for refactoring CHMSFLOW functions to support vector operations using tidyverse patterns.
Improves code maintainability and aligns with tidyverse guidelines.
- Establish measure-specific missing data precedence logic:
* For demographics-based measures: tagged_na("a") takes precedence
* Rationale: If core demographics are "not applicable", entire measure invalid
* Mixed codes (6+7): Result is tagged_na("a") not tagged_na("b")
- Update precedence order in both alcohol functions:
* Check "not applicable" (6) before "missing" (7,8,9)
* Add detailed comments explaining clinical rationale
* Document that precedence logic should be measure-specific
- Provide template guidance for other derived measures:
* Demographics-based: tagged_na("a") precedence
* Symptom/behavior-based: tagged_na("b") may take precedence
* Decision depends on clinical logic and survey design intent
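The precedence rule described in this commit message can be illustrated with a minimal sketch. It assumes the coded values named above (6 for "not applicable"; 7, 8, 9 for "missing") and uses `haven::tagged_na()` with `dplyr::case_when()`. The function name `derive_with_na_precedence` and the placeholder derivation `x + y` are hypothetical and are not taken from chmsflow.

```r
library(dplyr)
library(haven)

# Hypothetical helper, not a chmsflow function: combines two coded inputs
# into one derived value for a demographics-based measure.
# Assumed codes: 6 = "not applicable"; 7, 8, 9 = "missing".
derive_with_na_precedence <- function(x, y) {
  case_when(
    # "Not applicable" is checked first, so mixed codes (6 + 7)
    # resolve to tagged_na("a") rather than tagged_na("b").
    x == 6 | y == 6 ~ tagged_na("a"),
    # "Missing" codes are checked only after "not applicable".
    x %in% 7:9 | y %in% 7:9 ~ tagged_na("b"),
    # Placeholder derivation for valid inputs (an assumption for illustration).
    TRUE ~ as.numeric(x + y)
  )
}

# Rows: valid, not applicable, missing, mixed (6 + 7)
result <- derive_with_na_precedence(c(1, 6, 7, 6), c(2, 1, 3, 7))
na_tag(result)
```

Because `case_when()` evaluates its conditions in order, swapping the first two branches would give a symptom- or behavior-based measure in which `tagged_na("b")` takes precedence instead.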
…entation, tests, and variable-details as a result
Vectorization of CHMS functions
…ariables; improved readability of main README and meds qmd
Comprehensive catalog of 13 CHMS databases following Dublin Core standards:

Fully validated databases (Cycles 1-6):
- Cycle 1 (2007): Id=10263, 15 sites, 5,604 participants, ages 6-79
- Cycle 2 (2009-2011): Id=10264, 18 sites, 6,395 participants, ages 3-79
- Cycle 3 (2012-2013): Id=136652, ~5,500 participants, ages 3-79
- Cycle 4 (2014-2015): Id=148760, 16 sites, 5,794 participants, ages 3-79
  * First cycle with Hepatitis C RNA testing
- Cycle 5 (2016-2017): Id=251160, ~5,700 participants, ages 3-79
- Cycle 6 (2018-2019): Id=1195092, ages 3-79
  * Data often combined with Cycle 5 for analysis

Partially validated (Cycle 7):
- Collection: 2020-2021
- Survey ID marked with TODO for verification

Medication files (cycle1_meds through cycle6_meds):
- Reference parent cycle survey IDs
- Available through RDC

Validation process:
- URLs verified against Statistics Canada IMDB pages
- Precise collection dates confirmed from official documentation
- Sample sizes validated from data user guides
- Access restrictions updated: RDC only (no PUMF available)

Dublin Core fields included:
- title, description, creator, publisher
- subject, date, type, language
- identifier (DOI, catalogue number, SDDS)
- coverage (spatial, temporal, population)

Future work:
- Validate Cycle 7 survey ID and collection details
- Add final sample sizes when available
- Verify data user guide URLs for newer cycles
Schema documentation for recodeflow metadata structure applied to CHMS:

Schema files (inst/metadata/schemas/chms/):
- variables.yaml: Field definitions for variables.csv (variable, label, variableType, databaseStart, variableStart, etc.)
- variable_details.yaml: Field definitions for variable-details.csv (variable, recodes, categories, typeStart, typeEnd, etc.)
- chms_database_config.yaml: CHMS-specific database configuration (valid cycles, selection strategies, CHMS observations)

Documentation (inst/metadata/README.md):
- Distinguishes recodeflow conventions vs CHMS-specific patterns
- Explains variableStart format patterns:
  * Bracket format: [varname] for consistent names
  * Cycle-prefixed: cycle1::varname for cycle-specific names
  * Mixed format: cycle1::var1, [var2] (override + default pattern)
  * DerivedVar: DerivedVar::[var1, var2] for calculated variables
- Documents range notation for categorical recodes:
  * Integer ranges: [7,9] includes 7,8,9
  * Continuous ranges: [18.5,25) for BMI categories
  * Special values: 'else' as catch-all
- CHMS observations: Cycle 1 often used different variable names than Cycles 2-6 (handled via mixed format convention)

Purpose: These schemas document the metadata structure that MockData functions parse and validate. They are independently useful for understanding CHMS metadata conventions and serve as reference documentation for anyone working with variables.csv and variable-details.csv files.
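To make the range notation above concrete, here is a minimal base R sketch that interprets strings such as [7,9], [18.5,25) and 'else'. The helpers `parse_range` and `in_range` are hypothetical names for illustration. They are not part of chmsflow, recodeflow, or MockData, whose actual parsing may differ.

```r
# Hypothetical helper: split a range string such as "[7,9]" or "[18.5,25)"
# into numeric bounds plus flags saying whether each bound is inclusive.
parse_range <- function(notation) {
  pattern <- "^(\\[|\\() *(-?[0-9.]+) *, *(-?[0-9.]+) *(\\]|\\))$"
  parts <- regmatches(notation, regexec(pattern, notation))[[1]]
  if (length(parts) == 0) {
    stop("Not a recognised range: ", notation)
  }
  list(
    lower = as.numeric(parts[3]),
    upper = as.numeric(parts[4]),
    lower_closed = parts[2] == "[",  # "[" includes the lower bound
    upper_closed = parts[5] == "]"   # "]" includes the upper bound
  )
}

# Hypothetical helper: vectorized test of whether values fall in a range.
# The special value "else" is treated as a catch-all that matches everything.
in_range <- function(x, notation) {
  if (identical(notation, "else")) {
    return(rep(TRUE, length(x)))
  }
  r <- parse_range(notation)
  above <- if (r$lower_closed) x >= r$lower else x > r$lower
  below <- if (r$upper_closed) x <= r$upper else x < r$upper
  above & below
}

in_range(c(6, 7, 8, 9, 10), "[7,9]")    # closed on both ends
in_range(c(18.5, 24.9, 25), "[18.5,25)") # upper bound excluded
```

In this sketch the catch-all is only meaningful when evaluated after the more specific ranges, which is an assumption about how 'else' rows are ordered in variable-details.csv.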
…cles 1-6, as well as targeted sample size for cycle 7; corrected url for cycle3_meds and removed broken user guide links for cycles 3-6
Add Dublin Core metadata for CHMS database cycles
…ed mentions of validate-metadata.R in inst/metadata README
Add CHMS metadata schemas for variables and variable_details
README, explaining dependency restoration and local install process with renv and devtools. renv remains ignored from package build, as per typical CRAN approach.
CRAN prep: Add renv, update package metadata, and debug vignettes
…k-data-review branch
…most 100% test coverage; fixed NA handling of medication functions
Hello Doug,
Hope all is well. After spending the last few days with Claude improving test coverage and documentation, I have prepared one final pull request for you to review `chmsflow` before we submit it to CRAN for the first time.
Your main tasks are to:
- Confirm that `LICENSE` and `DESCRIPTION` are correct.
- Decide whether the `is_taking_drug_class` function is needed for the package, as its use in the `cycles1to2_*` medication functions was removed during function vectorization.
Upon your approval, I will merge all these changes to `main`, from which I will submit the tarball (.tar.gz) to CRAN. Then, upon CRAN approval, I will create a release (branch) on GitHub for the package's first version (0.1.0).
On my end, `devtools::check()` passes with no errors, warnings, or notes, and I have already confirmed with Claude that `chmsflow` meets the CRAN guidelines outlined for source packages.
Please let me know what you think.
Sincerely,
Rafidul