Open
Description
For numeric variables with only a few distinct values (say 2-3), the percentile values are identical across many percentile points. As a result, the condition in np.where(arr > arr[start]) can break and return the wrong lowest percentile, leaving the program stuck in the while loop.
def get_next_range(arr, group_range, start):
    if group_range + start >= 100:
        return 100
    elif (100 - group_range/2) < start + group_range:
        return 100
    elif arr[-1] == arr[start]:
        return 100
    elif (arr[start+group_range] == arr[start]) or (arr[start] < 0):
        return np.max([np.min(np.where(arr > arr[start])), np.min(np.where(arr >= 0))])
    else:
        return group_range + start
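To make the tie situation concrete, here is a minimal sketch. The sample data and the names `values`, `n_distinct`, and `next_idx` are made up for illustration and are not taken from the repository. It only shows how many percentile points coincide for a low-cardinality variable and what the np.where lookup from get_next_range returns in that case.

```python
import numpy as np

# Hypothetical low-cardinality variable: only three distinct values.
values = np.array([0.0] * 50 + [1.0] * 30 + [2.0] * 20)

# Percentile array built the same way as in this issue (p = 0..99).
arr = np.array([np.percentile(values, p) for p in range(0, 100)])

# Far fewer distinct percentile values than percentile points.
n_distinct = len(np.unique(arr))

# The lookup used in get_next_range: first index whose value exceeds arr[start].
start = 0
next_idx = int(np.min(np.where(arr > arr[start])))

print("distinct percentile values:", n_distinct)
print("next index after start:", next_idx)
```

Every percentile between start and next_idx holds the same value, so the lookup depends entirely on exact floating-point comparisons between entries that are meant to be equal.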
To fix this, the percentile values should be rounded to a fixed number of decimal places after they are calculated. Something like the following:
percentiles = np.around(np.array([np.percentile(df1[var],p) for p in range(0,100)]), decimals = 5)
should fix this issue.
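The suggested rounding could be wrapped in a small helper, sketched below. The function name `rounded_percentiles` and its `decimals` parameter are hypothetical, and the snippet assumes that the mismatch between "equal" percentiles comes from floating-point noise, which this issue does not confirm. The second half only demonstrates that np.around makes two such noisy values compare equal.

```python
import numpy as np

def rounded_percentiles(values, decimals=5):
    """Return percentiles 0..99 of `values`, rounded to `decimals` places.

    Hypothetical helper; mirrors the one-liner proposed in this issue.
    """
    raw = np.percentile(values, np.arange(0, 100))
    return np.around(raw, decimals=decimals)

# Two values that differ only by floating-point noise...
noisy = np.array([0.1 + 0.2, 0.3])
# ...compare equal once rounded to a fixed number of decimals.
rounded = np.around(noisy, decimals=5)

print(noisy[0] == noisy[1], rounded[0] == rounded[1])
```

Passing the whole range of percentile points to np.percentile in one call gives the same values as the list comprehension while avoiding a Python-level loop.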