@yulric
Contributor

@yulric yulric commented Nov 18, 2025

The purpose of this PR is to prepare the dev branch for merging the changes in from the v3.0.0 branch. To that end it does the following:

  1. Adds the new columns from the v3.0.0 branch into the variables and variable details sheet in the same order
  2. Adds a new set of functions to check and fix formatting issues in a worksheet.

* The new columns are used to provide versioning information for each
  variable.
* The version column is set to 2.2.0 which is the current version of the
  package
* The lastUpdated column is set to the date in the v3.0.0 branch. The date
  does not really matter for this commit.
* The status column is set to active.
The default value was brought over from the v3.0.0 branch.
The columns now match up with what's in the v3.0.0 branch which should
improve git diffs and make it easier to review changes
The column order now matches up with what's in the v3.0.0 branch which
should improve diffs and make it easier to review changes.
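For illustration, the column changes described above can be sketched roughly as follows. This is a minimal Python sketch, not the package's R code: the function name `add_and_reorder_columns`, the example column list, and the placeholder date are all assumptions made for the example. Only the `version` value of 2.2.0 and the `status` value of active come from the description above.

```python
import csv
import io

# Defaults for the new versioning columns. The version and status values come
# from the PR description; the date is a placeholder, not the real value.
EXAMPLE_DEFAULTS = {
    "version": "2.2.0",
    "lastUpdated": "2025-01-01",  # hypothetical placeholder date
    "status": "active",
}


def add_and_reorder_columns(csv_text, expected_order, defaults):
    """Add any missing columns with default values and reorder the sheet.

    Assumes rectangular input (no row has more fields than the header).
    Columns that are not listed in expected_order are kept at the end,
    in their original relative order.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    existing = reader.fieldnames or []
    extras = [col for col in existing if col not in expected_order]
    fieldnames = list(expected_order) + extras

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        for col in fieldnames:
            # A column that is absent from the input gets its default (or "").
            if row.get(col) is None:
                row[col] = defaults.get(col, "")
        writer.writerow(row)
    return out.getvalue()
```

Calling it with a sheet that lacks the versioning columns fills them in and writes every column in the requested order, which is what keeps diffs between the two branches small.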
@yulric yulric force-pushed the prepare-for-v3 branch 6 times, most recently from 9a82075 to 8e43bb5 Compare December 29, 2025 19:51
This CEP proposes a standardization tool for variables.csv and
variable_details.csv to ensure consistent formatting across different
editors and operating systems, enabling clean semantic diffs in version
control.
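As a rough idea of what such a standardization step could look like, here is a minimal Python sketch. It is not the tool proposed in the CEP: the function name `standardize_csv_text` and the specific rules applied (LF line endings, minimal quoting, no trailing empty columns) are assumptions chosen to illustrate the kind of formatting differences that editors and operating systems introduce.

```python
import csv
import io


def standardize_csv_text(raw_text):
    """Return CSV text with LF line endings, minimal quoting, and no
    trailing columns that are empty in every row (header included)."""
    # Normalize Windows (CRLF) and old Mac (CR) line endings to LF.
    text = raw_text.replace("\r\n", "\n").replace("\r", "\n")

    # Parse the rows, skipping completely blank lines.
    rows = [row for row in csv.reader(io.StringIO(text, newline="")) if row]
    if not rows:
        return ""

    # Pad short rows so every row has the same width.
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]

    # Drop trailing columns that are empty in every row.
    while width > 0 and all(row[width - 1] == "" for row in rows):
        width -= 1

    out = io.StringIO()
    # QUOTE_MINIMAL only quotes fields that need it (commas, quotes, newlines).
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in rows:
        writer.writerow(row[:width])
    return out.getvalue()
```

Running every sheet through one function like this before committing means two people editing the same file in different editors produce byte-identical output for identical content.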
@yulric yulric force-pushed the prepare-for-v3 branch 2 times, most recently from 9792e8a to 642fc30 Compare December 29, 2025 20:00
@yulric yulric requested a review from DougManuel December 29, 2025 20:09
@yulric yulric marked this pull request as ready for review December 29, 2025 20:09
Contributor

@DougManuel DougManuel left a comment

A few comments:

  1. Potentially missing columns - The YAML schemas in inst/metadata/schemas/core/variables.yaml include notes in the expected order, and variable_details.yaml includes templateVariable as an optional field.

Should these be added?

  2. If you remember, I drafted a similar validation for my work-in-progress. That is in feature/csv-standardisation-updates. Here is a comparison of features, in case you want to look at the validations that I included. The pattern, enum and cross-field checks cover errors that I was dealing with when reviewing and creating code.
| Feature | csv-standardisation-updates | This PR |
| Column order validation | | |
| Row sorting validation | | |
| Line ending validation (LF/CRLF) | | |
| Excessive quoting detection | | |
| Trailing empty columns | | |
| Pattern validation (regex) | | |
| Enum validation | | |
| Cross-field validation | | |
| Auto-fix capability | | |
| GitHub Action | | |
  3. I validated column names, ordering, etc. dynamically, using the rules in the YAML schemas in inst/metadata/schemas/core/variables.yaml, rather than hardcoding them. The "source of truth" is an interesting discussion.
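
To make the schema-driven idea concrete, here is a minimal Python sketch. It is not the code in either branch: the function name `check_columns` is hypothetical, and the schema is shown as an already-parsed dictionary with an `expected_column_order` key, which is an assumption about how the YAML might look once loaded.

```python
def check_columns(header, schema):
    """Compare a sheet's header against the column order in a schema.

    Returns a list of human-readable issues; an empty list means the
    header matches the schema.
    """
    expected = schema["expected_column_order"]
    issues = []

    missing = [col for col in expected if col not in header]
    if missing:
        issues.append("missing columns: " + ", ".join(missing))

    unexpected = [col for col in header if col not in expected]
    if unexpected:
        issues.append("unexpected columns: " + ", ".join(unexpected))

    # Compare order only among the columns that appear in both lists, so a
    # missing column is not also reported as an ordering problem.
    shared_expected = [col for col in expected if col in header]
    shared_actual = [col for col in header if col in expected]
    if shared_actual != shared_expected:
        issues.append(
            "column order differs from schema: expected "
            + ", ".join(shared_expected)
        )
    return issues
```

Because the expected order is read from the schema at run time, adding a column such as notes to the YAML changes what the check enforces without touching the checking code.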

I also drafted a more extensive report inspired by pkgdown and devtools. I like your error reporting very much, but take a look at that branch if you are interested.

Minor Items

  • There's a capture.output(print(row_being_checked), file = "log.txt", append = TRUE) in recode-with-table.R:932 that looks like debug code - should that be removed?
  • readr is used in fix_worksheet() but isn't in DESCRIPTION. I think the validation should be a permanent feature of cchsflow, so adding readr to DESCRIPTION is worth considering.

@yulric
Contributor Author

yulric commented Jan 8, 2026

@DougManuel Thanks for the review! A couple of replies below:

  1. Feature disparity between your tool and this one: It looks like your tool has validation features which this tool is missing. The focus of this tool is to check and fix formatting issues. I definitely want to support validation though, and have actually started work on that in this PR: "Add parse_variables_sheet function with validation" (recodeflow#85).
  2. Metadata YAML: There's a lot there and I was thinking for this PR that we would only include the items we need. Right now, it looks like we only need the expected_column_order and the id_column_name fields. I will delete the other items and add them in at a later time as needed. What do you think?
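
As a rough picture of how small that metadata could be, here is a minimal Python sketch. Only the two field names, expected_column_order and id_column_name, come from the comment above. The example values, the function name `first_unsorted_id`, and the idea of using the id column to check row sorting are assumptions for illustration.

```python
import csv
import io

# Hypothetical trimmed-down metadata with only the two fields discussed.
EXAMPLE_METADATA = {
    "expected_column_order": ["variable", "label", "version"],
    "id_column_name": "variable",
}


def first_unsorted_id(csv_text, metadata):
    """Return the first id that sorts before the row above it, or None
    if the rows are already sorted by the id column."""
    id_col = metadata["id_column_name"]
    reader = csv.DictReader(io.StringIO(csv_text))
    ids = [row[id_col] for row in reader]
    for previous, current in zip(ids, ids[1:]):
        if current < previous:
            return current
    return None
```

Anything else in the YAML can be added back later when a check actually needs it.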

@yulric yulric force-pushed the prepare-for-v3 branch 4 times, most recently from 1c98e77 to a666706 Compare January 9, 2026 16:25
yulric and others added 2 commits January 9, 2026 11:26
@DougManuel
Contributor

  1. Feature disparity between your tool and this one: It looks like your tool has validation features which this tool is missing. The focus of this tool is to check and fix formatting issues. I definitely want to support validation though, and have actually started work on that in this PR: "Add parse_variables_sheet function with validation" (recodeflow#85).

For sure, keep this PR focused. I've actually been creating more checks as I encounter issues, but that is for the future.

@DougManuel
Contributor

  1. Metadata YAML: There's a lot there and I was thinking for this PR that we would only include the items we need. Right now, it looks like we only need the expected_column_order and the id_column_name fields. I will delete the other items and add them in at a later time as needed. What do you think?

Agree.

Feel free to close the PR.

@yulric yulric merged commit ba9f5e7 into dev Jan 9, 2026
1 check passed
@yulric yulric deleted the prepare-for-v3 branch January 9, 2026 17:55
3 participants