@yulric yulric commented Nov 18, 2025

The purpose of this PR is to prepare the dev branch for merging in the changes from the v3.0.0 branch. To that end, it does the following:

  1. Adds the new columns from the v3.0.0 branch to the variables and variable details sheets, in the same order as in that branch.
  2. Adds a new set of functions to check and fix formatting issues in a worksheet.

* The new columns are used to provide versioning information for each
  variable.
* The version column is set to 2.2.0, which is the current version of the
  package.
* The lastUpdated column is set to the date in the v3.0.0 branch. The date
  does not really matter for this commit.
* The status column is set to active.
The default value was brought over from the v3.0.0 branch.
The columns now match up with what's in the v3.0.0 branch which should
improve git diffs and make it easier to review changes
The column order now matches up with what's in the v3.0.0 branch which
should improve diffs and make it easier to review changes.
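As a rough illustration of the column changes described in these commits, the sketch below appends the three versioning columns to a small data frame. The sample rows, the helper name `add_version_columns`, and the placement of the new columns at the end are assumptions made for the example; the actual sheets follow the column order of the v3.0.0 branch, and the real lastUpdated date is not shown here, so it is left as `NA`.

```r
# Hypothetical stand-in for a few rows of the variables sheet.
variables <- data.frame(
  variable = c("var_a", "var_b"),
  label = c("First example variable", "Second example variable"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: append the versioning columns with default values.
# `last_updated` defaults to NA because the date used in the PR is not known.
add_version_columns <- function(sheet, version = "2.2.0",
                                last_updated = NA_character_,
                                status = "active") {
  n <- nrow(sheet)
  sheet$version <- rep(version, n)
  sheet$lastUpdated <- rep(last_updated, n)
  sheet$status <- rep(status, n)
  sheet
}

variables <- add_version_columns(variables)
```

Every existing row receives the same defaults, which matches the description above: one version, one date, and a status of active for all variables.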
@yulric yulric force-pushed the prepare-for-v3 branch 6 times, most recently from 9a82075 to 8e43bb5 Compare December 29, 2025 19:51
This CEP proposes a standardization tool for variables.csv and
variable_details.csv to ensure consistent formatting across different
editors and operating systems, enabling clean semantic diffs in version
control.
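One formatting difference that typically arises across operating systems is the line ending. The sketch below shows one way a standardization step could detect and normalise line endings in CSV text. The function names are hypothetical and are not taken from this PR or the CEP.

```r
# Hypothetical helper: classify the line endings found in a block of text.
detect_line_endings <- function(text) {
  has_crlf <- grepl("\r\n", text, fixed = TRUE)
  # Remove CRLF pairs first so that only standalone LFs remain to be found.
  without_crlf <- gsub("\r\n", "", text, fixed = TRUE)
  has_lf <- grepl("\n", without_crlf, fixed = TRUE)
  if (has_crlf && has_lf) {
    "mixed"
  } else if (has_crlf) {
    "CRLF"
  } else if (has_lf) {
    "LF"
  } else {
    "none"
  }
}

# Hypothetical helper: convert CRLF (and lone CR) line endings to LF.
normalise_line_endings <- function(text) {
  gsub("\r\n?", "\n", text)
}

windows_csv <- "variable,label\r\nvar_a,First\r\n"
fixed_csv <- normalise_line_endings(windows_csv)
```

Normalising to a single line ending before committing keeps a diff from flagging every row as changed when a file is saved on a different operating system.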
@yulric yulric requested a review from DougManuel December 29, 2025 20:09
@yulric yulric marked this pull request as ready for review December 29, 2025 20:09
@DougManuel DougManuel left a comment

A few comments:

  1. Potentially missing columns - The YAML schema in inst/metadata/schemas/core/variables.yaml includes notes in the expected column order, and variable_details.yaml includes templateVariable as an optional field.

Should these be added?

  2. If you remember, I drafted a similar validation for my work in progress, which is in feature/csv-standardisation-updates. Here is a comparison of features, in case you want to look at the validations that I included. The pattern, enum and cross-field checks cover errors that I ran into when reviewing and creating code.
Feature | csv-standardisation-updates | This PR
Column order validation
Row sorting validation
Line ending validation (LF/CRLF)
Excessive quoting detection
Trailing empty columns
Pattern validation (regex)
Enum validation
Cross-field validation
Auto-fix capability
GitHub Action
  3. I validated column names, ordering, etc. dynamically, using the rules in the YAML schemas in inst/metadata/schemas/core/variables.yaml, rather than hardcoding them. The "source of truth" is an interesting discussion.

I also drafted a more extensive report inspired by pkgdown and devtools. I like your error reporting very much, but take a look at that branch if you are interested.
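To make the column order point concrete, here is a minimal sketch of a check that compares a sheet's columns against an expected order. It assumes the expected order has already been read from the schema into a character vector; the schema layout, the function name `check_column_order`, and the example column lists are all hypothetical and are not taken from either branch.

```r
# Hypothetical check: compare actual column names against an expected order.
check_column_order <- function(actual, expected) {
  missing_cols <- setdiff(expected, actual)
  unexpected_cols <- setdiff(actual, expected)
  # Compare relative order using only the columns present on both sides,
  # so a missing column is reported once rather than as an ordering error.
  shared_actual <- actual[actual %in% expected]
  shared_expected <- expected[expected %in% actual]
  list(
    missing = missing_cols,
    unexpected = unexpected_cols,
    order_ok = identical(shared_actual, shared_expected)
  )
}

# Hypothetical expected order, standing in for values derived from the schema.
expected <- c("variable", "label", "notes", "version")
actual <- c("variable", "version", "label")

result <- check_column_order(actual, expected)
```

Passing the expected order in as an argument keeps the check independent of where that order comes from, so the same function works whether the source of truth is the YAML schema or a hardcoded vector.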

Minor Items

  • There's a capture.output(print(row_being_checked), file = "log.txt", append = TRUE) in recode-with-table.R:932 that looks like debug code - should that be removed?
  • readr is used in fix_worksheet() but isn't in DESCRIPTION. I think the validation should be a permanent feature of cchsflow, so adding readr to DESCRIPTION is worth considering.
