-
Notifications
You must be signed in to change notification settings - Fork 27
replacing median with 50th percentile #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
replacing median with 50th percentile #97
Conversation
Codecov Report
@@ Coverage Diff @@
## main #97 +/- ##
==========================================
+ Coverage 74.39% 74.44% +0.05%
==========================================
Files 32 32
Lines 2882 2888 +6
==========================================
+ Hits 2144 2150 +6
Misses 738 738
|
sfinkens
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, thanks! Your proposed solution looks good to me!
Regarding code cleanliness, I think it would make sense to create a helper method for the 50th percentile, so that the keyword args don't have to be repeatet.
def fiftieth_percentile(data):
return np.percentile(data, 50, interpolation='nearest')And please make sure to add a test ;)
|
@oliversus do you have time to fix this soon? |
Hi Martin and Stephan, |
sfinkens
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks a lot!
|
|
||
| return sat_azi, sat_zenith, sun_azi, sun_zenith, rel_azi | ||
|
|
||
| def _get_true_median(self, input_array, q=50, interpolation='nearest'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait :D If you have some spare time left...
This method doesn't access any class attribute, so I'd propose to move it out of the class to the module level. Or maybe to pygac.utils?
Method
correct_times_medianshould provide median values for year, doy, and milliseconds to correct invalid timestamps. However,np.medianwill return floats, which could even be fractions due tonp.median's internal interpolation method of the two closest values. I'd suggest to usenp.percentileinstead, where one can explicitly set the interpolation method. Setting this to "nearest" will always return the value closest to the statistical median, thus not changing its type.np.median([1, 2, 3, 4, 5, 6])returns 3.5
np.percentile([1, 2, 3, 4, 5, 6], 50, interpolation="nearest")returns 3
This will also avoid a
ValueErrorin methodto_datetime64, line 449, where conversion to datetime64 is only possible with year as an integer.