Skip to content

diet score#148

Closed
caitlink12 wants to merge 15 commits intoadl-additionsfrom
diet-score
Closed

diet score#148
caitlink12 wants to merge 15 commits intoadl-additionsfrom
diet-score

Conversation

@caitlink12
Copy link

@caitlink12 caitlink12 commented Oct 10, 2025

FVCDFRU, FVCDSAL, FVCDPOT, FVCDCAR, FVCDVEG, FVCDJUI (diet_score dependent variables) updated to include master survey cycles from 2001-2018
diet_score updated to include master survey cycles from 2001-2018
diet_score_cat3 updated to include master survey cycles from 2001-2018
master survey cycles added to FVCDFRU, FVCDSAL, FVCDPOT, FVCDCAR, FVCDVEG, FVCDJUI, diet_score and diet_score_cat3 in variables.csv

added in ICES surveys to diet score derived variable
updated function diet_score_cat3 to include master survey cycles
added in master survey cycles for required diet score variables to derive diet score
adjusted spacing between variable entries
included master survey cycles for diet score and diet-score dependent variables
@yulric yulric changed the base branch from feature/v3.0.0-validation-infrastructure to adl-additions October 15, 2025 19:03
Copy link
Contributor

@yulric yulric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The database names are not consistent between the variables and variable details sheet. The variable details sheet uses the _i suffix whereas the variables sheet uses the _m suffix. Can you fix that?

all _i suffixes changed to _m
edited cchs20013_2014_m to cchs2013_2014_m
Copy link
Contributor

@yulric yulric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still seeing the extra zeros in some lines. I've replied to the comments that have them.

removed extra zero in cchs2013_2014_m cycle name for FVCDCAR, FVCDFRU, FVCDJUI, FVCDPOT, FVCDSAL, FVCDTOT, FVCDVEG
…s; fixed all their databaseStart and variableStart info too
@DougManuel
Copy link
Contributor

DougManuel commented Feb 11, 2026

Reviewed the FVC and diet_score variable changes (FVCDFRU, FVCDSAL, FVCDPOT, FVCDCAR, FVCDVEG, FVCDJUI, diet_score, diet_score_cat3). No issues found. Checked for:

  • variableStart mapping correctness: All pre-2007 cycle-letter mappings (FVCA/FVCC/FVCE prefixes), 2007-2014 defaults, and 2015+ renamed variables (FVCDVFRU, FVCDVGRN, FVCDVORA, FVCDVPOT, FVCDVVEG, FVCDVJUI) are correct for both PUMF and master cycles.
  • databaseStart consistency: variables.csv and variable_details.csv databaseStart fields match for all 8 variables. All 9 master cycles (cchs2001_m through cchs2017_2018_m) present.
  • Dangerous [VAR] defaults: No [VAR] defaults cross the 2015 naming boundary — 2015+ master cycles all have explicit mappings.
  • FVCDJUI/FVCDPOT _i to _m correction: Correctly replaced deprecated _i suffix with _m. The cchs20013_2014_i typo was fixed.
  • units field: "times/day" added consistently to all FVC variable rows in variable_details.csv.
  • Derived variables: diet_score and diet_score_cat3 correctly reference harmonized variable names via DerivedVar::.

Note: This review covers only the FVC/diet_score variables. The variables.csv full-file rewrite (column reordering, new metadata columns) was not reviewed here.

@DougManuel
Copy link
Contributor

DougManuel commented Feb 11, 2026

Code review

Reviewed 9 derived FVC/diet variables and 30 raw FVC variables for PUMF and Master across 2001-2018 cycles.

L6 integration test: cross-cycle prevalence

Ran rec_with_table() against PUMF data for all 12 cycles:

No step changes at the 2014-2015 era boundary. The 2015+ variable renames (FVCDVFRU, FVCDVGRN, FVCDVORA, FVCDVPOT, FVCDVVEG, FVCDVJUI) are correctly mapped.

Master (_m) mappings validated by worksheet checks only -- no runtime data available for L6 testing.

Issues found and fixed

All issues below have been fixed in commits 8939654 and d8f3890.

P0 -- data bug:

  1. FVC_6D swapped recEnd values: valid range [1,120] was mapped to NA::a and not-applicable code 996 was mapped to copy -- reversed at runtime, producing wrong data

P1 -- naming/mapping errors:
2. Database name typo in all 30 FVC_* raw variables: chs2011_2012_m and chs2013_2014_m should be cchs2011_2012_m and cchs2013_2014_m (missing leading c). Affected 174 rows across both worksheets.
3. Dummy variable suffix-recEnd misalignment: FVC_1A and FVC_5A had suffixes that didn't match their recEnd values (copy-paste error)
4. Dummy variable naming: renamed _NA::a/_NA::b to _NAa/_NAb for FVC_1A-6A and diet_score_cat3; set Func row dummyVariable to N/A

P2 -- metadata quality:
5. Fixed double spaces in FVC_4E, FVC_5E labels and catLabels
6. Fixed FVCDTOT "consumptoin" typo and trailing dash in labelLong
7. Fixed inconsistent labelLong formatting for FVCDFRU, FVCDSAL, FVCDVEG
8. Added descriptions for 7 FVCD* derived frequency variables
9. Improved diet_score_cat3 labelLong and description
10. Added reviewNotes for 30 FVC_* variables

Schema:

Checked: era boundary defaults, databaseStart consistency (variables.csv vs variable_details.csv), PUMF/Master naming conventions, pre-2007 cycle letters, DV function specifications (diet_score_fun, diet_score_fun_cat), dummyVariable naming, swapped recEnd values, label/metadata consistency, known error patterns, and PUMF integration across all cycles.

Note: Post-approval commit 16a8f3a was pushed on 2026-02-10 (after yulric's approval on 2025-12-04). Review fixes applied in commits 8939654 and d8f3890.

CEP: ceps/cep-007-diet/

variables.csv:
- Fix chs→cchs typo in databaseStart for 30 FVC_* variables (2011/2013 master)
- Fix double spaces in FVC_4E, FVC_5E labels
- Fix FVCDTOT "consumptoin" typo and trailing dash in labelLong
- Fix inconsistent labelLong formatting for FVCDFRU, FVCDSAL, FVCDVEG
- Add descriptions for 7 FVCD* derived frequency variables
- Improve diet_score_cat3 labelLong and description
- Add reviewNotes for 30 FVC_* variables

variable_details.csv:
- Fix chs→cchs typo in databaseStart/variableStart for 144 FVC_* rows
- Fix FVC_6D swapped recEnd values (P0 data bug: valid range mapped to NA)
- Fix FVC_1A and FVC_5A dummy variable suffix-recEnd misalignment
- Rename _NA::a/_NA::b to _NAa/_NAb for FVC_1A-6A and diet_score_cat3
- Set diet_score_cat3 Func row dummyVariable to N/A
- Fix double spaces in FVC_4E, FVC_5E catLabel and related fields
- Fix metadata_registry.yaml: replace _NA::[a-z] with _NA[a-z] in regex,
  remove _cont{N} pattern (continuous variables use N/A), add missing
  category examples (_NAa, _NAb)
- Fix variable_details.yaml: update dummyVariable notes to match actual
  convention with regex pattern and correct examples

See issue #172 for propagation to other branches.
Copy link
Contributor

@DougManuel DougManuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. All issues from the review have been fixed in commits 8939654 and d8f3890 (see updated review comment above).

@rafdoodle rafdoodle changed the base branch from adl-additions to v3 February 13, 2026 15:20
@rafdoodle rafdoodle changed the base branch from v3 to adl-additions February 13, 2026 15:20
Review summary, L6 integration test script, PUMF test results,
and cross-cycle prevalence QMD from the diet-score PR review.
rafdoodle added a commit that referenced this pull request Feb 13, 2026
@rafdoodle rafdoodle closed this Feb 13, 2026
@rafdoodle rafdoodle deleted the diet-score branch February 13, 2026 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants