I want to set the labels of a binned histogram automatically from the cut intervals. The bins are created by applying pd.cut() to a dataframe. The list of bin edges passed to pd.cut is specified manually (see cut list below), but I want the histogram labels to be generated automatically from that list. How do I convert the cut list to a label list in code?
#cut list
cut = [0,20,40,60,80,100]
#desired label list
label = ['[0-20]', ']20-40]', ']40-60]', ']60-80]', ']80-100]']
#to be used for:
pd_cut = pd.cut(df, cut, labels=label, include_lowest=True).astype(str)
You can use zip to go through the pairs of consecutive edges, appending to the list label as you go:
cut = [0,20,40,60,80,100]
label = []
for i, p in enumerate(zip(cut, cut[1:])):
    ob = '[' if i == 0 else ']'
    label.append('{}{}-{}]'.format(ob, *p))
print(label)
Output:
['[0-20]', ']20-40]', ']40-60]', ']60-80]', ']80-100]']
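To see why this works, it may help to look at what zip(cut, cut[1:]) produces on its own. The following is a small sketch using the same cut list; the variable name pairs is only for illustration:

```python
cut = [0, 20, 40, 60, 80, 100]

# Pair each edge with the one after it. zip stops at the shorter input,
# so the last edge never starts a pair of its own.
pairs = list(zip(cut, cut[1:]))
print(pairs)
# [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
```

Each tuple is one interval, so there is always one fewer pair than there are edges.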
Instead of zip, enumerate, and slicing, you can use a classic for loop with range and len:
label = []  # start from an empty list again
for i in range(len(cut) - 1):
    ob = '[' if i == 0 else ']'
    label.append('{}{}-{}]'.format(ob, cut[i], cut[i + 1]))
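If the labels are needed for several different cut lists, either loop can be wrapped in a function. This is only a sketch: the name make_labels is hypothetical (it is not part of pandas), and it assumes the edges are already sorted in increasing order.

```python
def make_labels(edges):
    """Build one label per interval between consecutive edges.

    The first interval is written as closed on the left ('['); all later
    ones are written as open on the left (']'), matching the label style
    asked for in the question.
    """
    return [
        '{}{}-{}]'.format('[' if i == 0 else ']', lo, hi)
        for i, (lo, hi) in enumerate(zip(edges, edges[1:]))
    ]


cut = [0, 20, 40, 60, 80, 100]
label = make_labels(cut)
print(label)
```

The result can then be passed as labels=label in the pd.cut call from the question. pd.cut expects one label per bin, that is len(cut) - 1 labels, which is what this construction produces.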