Fix calculated variables being left out of SingleYearDataset objects#23
Merged
juaristi22 merged 6 commits into main on Aug 22, 2025
juaristi22 commented on Aug 20, 2025
baogorek (Collaborator) approved these changes on Aug 21, 2025 and left a comment:
My only comments relate to documentation (e.g., docstrings) and are non-blocking. I did have a bit of trouble figuring out exactly what was going on, but I get it now. I thought this Claude Code summary was really nice:
● The PR fixes a bug where pre-computed income values were being lost during dataset minimization.
When creating smaller dataset subsets for calibration, the code was only copying "input variables" (raw data like age, state) but dropping "calculated variables" (pre-computed values like employment_income, self_employment_income stored in the original dataset).
This resulted in minimized datasets with all zeros for income fields, making them useless for calibration.
Maria's fix:
1. Identifies which variables in the dataset are calculated (not inputs)
2. Explicitly preserves these calculated variables and their values when creating subsets
3. Ensures income data remains intact throughout the calibration pipeline
The tests confirm that employment_income, self_employment_income, and weekly_hours_worked now retain their non-zero values after minimization.
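For anyone else trying to picture the behaviour described in the summary, here is a minimal sketch of the idea, not the code in this PR. It assumes a dataset can be modelled as a plain mapping from variable name to a list of values; `split_variables`, `subset_table`, and the `keep_calculated` flag are hypothetical names chosen for illustration and are not part of SingleYearDataset.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical stand-in for a dataset: variable name -> one value per record.
Table = Dict[str, List[float]]


def split_variables(table: Table, input_variables: Set[str]) -> Tuple[List[str], List[str]]:
    """Partition the stored variables into inputs and calculated (non-input) ones."""
    inputs = [name for name in table if name in input_variables]
    calculated = [name for name in table if name not in input_variables]
    return inputs, calculated


def subset_table(
    table: Table,
    rows: Sequence[int],
    input_variables: Set[str],
    keep_calculated: bool = True,
) -> Table:
    """Build a smaller table from the selected row indices.

    With keep_calculated=False only input variables are copied, which mirrors
    the bug described above. With keep_calculated=True the stored calculated
    variables are carried over together with their values.
    """
    inputs, calculated = split_variables(table, input_variables)
    keep = inputs + (calculated if keep_calculated else [])
    return {name: [table[name][i] for i in rows] for name in keep}


full = {
    "age": [30, 45, 60],
    "employment_income": [50000.0, 0.0, 72000.0],
}
buggy = subset_table(full, rows=[0, 2], input_variables={"age"}, keep_calculated=False)
fixed = subset_table(full, rows=[0, 2], input_variables={"age"})
```

In this toy version the dropped variable is simply missing from `buggy`; in the real pipeline, per the summary, the loss showed up as all-zero income fields.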
If there's any risk that other users are going to hit this in the future and not know what happened, is there any way to warn them proactively?
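One possible shape for such a proactive warning, offered only as a sketch: check the minimized data for variables that are expected to carry non-zero values and emit a standard `warnings.warn` if any are missing or entirely zero. The function names and the dict-of-lists layout below are assumptions for illustration, not an existing API in this repository.

```python
import warnings
from typing import Dict, Iterable, List

# Hypothetical stand-in for a dataset: variable name -> one value per record.
Table = Dict[str, List[float]]


def find_all_zero_variables(table: Table, expected_nonzero: Iterable[str]) -> List[str]:
    """Return the expected variables that are missing or contain only zeros."""
    return [
        name
        for name in expected_nonzero
        if name not in table or not any(table[name])
    ]


def warn_on_lost_variables(table: Table, expected_nonzero: Iterable[str]) -> List[str]:
    """Warn once if any expected variable looks like it was lost in minimization."""
    suspicious = find_all_zero_variables(table, expected_nonzero)
    if suspicious:
        warnings.warn(
            "These variables are missing or all zero after minimization: "
            + ", ".join(suspicious)
            + ". Calculated variables may have been dropped.",
            UserWarning,
            stacklevel=2,
        )
    return suspicious
```

A check like this could run at the end of the minimization step, with the list of expected variables supplied by the caller, so that a silent loss of income data becomes visible instead of surfacing later during calibration.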
Fix #22