-
Notifications
You must be signed in to change notification settings - Fork 334
hist(group): add validation & empty-data handling; docstring, guide and tests #658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…and empty-group handling; add tests; fix folium tile attribution and BeautifyIcon textColor compatibility
| Constraints and behavior: | ||
| - ``group`` cannot be combined with ``bin_column``. | ||
| - ``group`` requires exactly one histogram value column. If more | ||
| than one value column is passed, a ``ValueError`` is raised. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to specify which exception is raised if a constraint is violated.
| - ``group`` cannot be combined with ``bin_column``. | ||
| - ``group`` requires exactly one histogram value column. If more | ||
| than one value column is passed, a ``ValueError`` is raised. | ||
| - If ``group`` does not reference an existing column (by label or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed
| warnings.warn("It looks like you're making a grouped histogram with " | ||
| "a lot of groups ({:d}), which is probably incorrect." | ||
| .format(grouped.num_rows)) | ||
| if grouped.num_rows == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the argument for this change in behavior?
| # This code is factored as a function for clarity only. | ||
| n = len(values_dict) | ||
| if n == 0: | ||
| # Create an empty figure to maintain a no-error contract on empty groups |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the argument for this change in behavior? Why is it important to avoid exceptions in this case?
|
|
||
| Notes and constraints: | ||
| - `group` cannot be used together with `bin_column`. | ||
| - `group` expects exactly one numeric value column (e.g., `'height'`). Passing multiple value columns raises a `ValueError`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the exception raised should be part of the API specification.
| Notes and constraints: | ||
| - `group` cannot be used together with `bin_column`. | ||
| - `group` expects exactly one numeric value column (e.g., `'height'`). Passing multiple value columns raises a `ValueError`. | ||
| - If `group` does not reference an existing column label or index, a `ValueError` is raised. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the exception raised should be part of the API specification.
| - `group` cannot be used together with `bin_column`. | ||
| - `group` expects exactly one numeric value column (e.g., `'height'`). Passing multiple value columns raises a `ValueError`. | ||
| - If `group` does not reference an existing column label or index, a `ValueError` is raised. | ||
| - If the data are empty for all groups, `hist` creates an empty figure and returns without error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
|
Thanks for these improvements! I prefer that each pull request address one self-contained issue. Please split out the fixes to maps/tiles in a separate PR. I do prefer that we stick to .rst. |
Background
Table.histneeded clearer documentation and safer behavior for thegroupargument.Changes
Code
datascience/tables.py:histgroupis a legal column (label or index); otherwise raiseValueErrorwith a clear message.groupcannot be used withbin_column; and grouping allows only one numeric data column.t.hist('height', group='gender')t.hist('height', group='gender', side_by_side=True)Docs
docs/hist_grouping.md:.rstand add it to the TOC.)Tests
tests/test_hist_group.pyValueError.(Optional) Maps small fixes
datascience/maps.pyattrsupplied (Folium >= 0.20).text_color->textColoroption forBeautifyIconfor backward compatibility.Verification
Files changed (for quick review)
datascience/tables.py[around lines ~5310, ~5398, ~5420, ~5455]docs/hist_grouping.mdtests/test_hist_group.pydatascience/maps.pyCompatibility & Notes