Skip to content

Conversation

@rafdoodle
Copy link
Collaborator

@rafdoodle rafdoodle commented Jan 19, 2026

Hello Doug,

Hope all is well. After spending the last few days with Claude improving test coverage and documentation, I have prepared one final pull request for you to review chmsflow before we submit it to CRAN for the first time.

Your main tasks are to:

  • Check if information in LICENSE and DESCRIPTION are correct.
  • Check if PR Medications updates + new variable names #10 is necessary for CRAN submission and if it needs to be dealt with first.
  • Check if the is_taking_drug_class function is needed for the package, as its use in the cycles1to2_* medication functions was removed during function vectorization.
  • Check if package looks good and is working on your end (perhaps more in detail if you have the time).

Upon your approval, I will merge all these changes to main, from which I will submit the tarball (.tar.gz) to CRAN. Then, upon CRAN approval, I will create a release (branch) on GitHub for the packages's first version (0.1.0).

From my end already, devtools::check() passes with no errors, warnings, and notes, and I already confirmed with Claude that chmsflow meets the CRAN guidelines outlined for source packages.

Please let me know what you think.

Sincerely,
Rafidul

DougManuel and others added 30 commits September 18, 2025 14:30
A guide for refactoring CHMSFLOW functions to support vector operations using tidyverse patterns.
Improves code maintainability and aligns with tidyverse guidelines.
 - Establish measure-specific missing data precedence logic:
    * For demographics-based measures: tagged_na("a") takes precedence
    * Rationale: If core demographics are "not applicable", entire measure invalid
    * Mixed codes (6+7): Result is tagged_na("a") not tagged_na("b")

  - Update precedence order in both alcohol functions:
    * Check "not applicable" (6) before "missing" (7,8,9)
    * Add detailed comments explaining clinical rationale
    * Document that precedence logic should be measure-specific

  - Provide template guidance for other derived measures:
    * Demographics-based: tagged_na("a") precedence
    * Symptom/behavior-based: tagged_na("b") may take precedence
    * Decision depends on clinical logic and survey design intent
…entation, tests, and variable-details as a result
…ariables; improved readability of main README and meds qmd
Comprehensive catalog of 13 CHMS databases following Dublin Core standards:

Fully validated databases (Cycles 1-6):
- Cycle 1 (2007): Id=10263, 15 sites, 5,604 participants, ages 6-79
- Cycle 2 (2009-2011): Id=10264, 18 sites, 6,395 participants, ages 3-79
- Cycle 3 (2012-2013): Id=136652, ~5,500 participants, ages 3-79
- Cycle 4 (2014-2015): Id=148760, 16 sites, 5,794 participants, ages 3-79
  * First cycle with Hepatitis C RNA testing
- Cycle 5 (2016-2017): Id=251160, ~5,700 participants, ages 3-79
- Cycle 6 (2018-2019): Id=1195092, ages 3-79
  * Data often combined with Cycle 5 for analysis

Partially validated (Cycle 7):
- Collection: 2020-2021
- Survey ID marked with TODO for verification

Medication files (cycle1_meds through cycle6_meds):
- Reference parent cycle survey IDs
- Available through RDC

Validation process:
- URLs verified against Statistics Canada IMDB pages
- Precise collection dates confirmed from official documentation
- Sample sizes validated from data user guides
- Access restrictions updated: RDC only (no PUMF available)

Dublin Core fields included:
- title, description, creator, publisher
- subject, date, type, language
- identifier (DOI, catalogue number, SDDS)
- coverage (spatial, temporal, population)

Future work:
- Validate Cycle 7 survey ID and collection details
- Add final sample sizes when available
- Verify data user guide URLs for newer cycles
Schema documentation for recodeflow metadata structure applied to CHMS:

Schema files (inst/metadata/schemas/chms/):
- variables.yaml: Field definitions for variables.csv
  (variable, label, variableType, databaseStart, variableStart, etc.)
- variable_details.yaml: Field definitions for variable-details.csv
  (variable, recodes, categories, typeStart, typeEnd, etc.)
- chms_database_config.yaml: CHMS-specific database configuration
  (valid cycles, selection strategies, CHMS observations)

Documentation (inst/metadata/README.md):
- Distinguishes recodeflow conventions vs CHMS-specific patterns
- Explains variableStart format patterns:
  * Bracket format: [varname] for consistent names
  * Cycle-prefixed: cycle1::varname for cycle-specific names
  * Mixed format: cycle1::var1, [var2] (override + default pattern)
  * DerivedVar: DerivedVar::[var1, var2] for calculated variables
- Documents range notation for categorical recodes:
  * Integer ranges: [7,9] includes 7,8,9
  * Continuous ranges: [18.5,25) for BMI categories
  * Special values: 'else' as catch-all
- CHMS observations: Cycle 1 often used different variable names
  than Cycles 2-6 (handled via mixed format convention)

Purpose:
These schemas document the metadata structure that MockData functions
parse and validate. They are independently useful for understanding
CHMS metadata conventions and serve as reference documentation for
anyone working with variables.csv and variable-details.csv files.
…cles 1-6, as well as targeted sample size for cycle 7; corrected url for cycle3_meds and removed broken user guide links for cycles 3-6
Add Dublin Core metadata for CHMS database cycles
rafdoodle and others added 10 commits October 24, 2025 21:09
…ed mentions of validate-metadata.R in inst/metadata README
Add CHMS metadata schemas for variables and variable_details
README, explaining dependency restoration and local install process with renv and devtools.

renv remains ignored from package build, as per typical CRAN approach.
CRAN prep: Add renv, update package metadata, and debug vignettes
…most 100% test coverage; fixed NA handling of medication functions
@rafdoodle rafdoodle requested a review from DougManuel January 19, 2026 02:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants