Transform dataframe value to range value in Python 3


I have a dataframe with the values:

3.05
35.97
49.11
48.80
48.02
10.61
25.69
6.02 
55.36
0.42
47.87
2.26
54.43
8.85 
8.75
14.29
41.29
35.69
44.27
1.08

I want to transform each value into a range and assign a new value to each one. From the df we know the min value is 0.42 and the max value is 55.36. I want to divide the range from min to max into 4 groups:

0.42  - 14.15 transform to 1 
14.16 - 27.88 transform to 2
27.89 - 41.61 transform to 3
41.62 - 55.36 transform to 4

So the result I expect is:

1
3
4
4
4
1
2
1
4
1
4
1
4
1
1
2
3
3
4
1

Solution

  • This is usually called binning; in pandas it is done with the cut function (pd.cut). Sample code is below:

    import pandas as pd
    
    # Column data, keyed by the column name "nums"
    data = {'nums': [3.05, 35.97, 49.11, 48.80, 48.02, 10.61, 25.69, 6.02, 55.36, 0.42, 47.87, 2.26, 54.43, 8.85, 8.75, 14.29, 41.29, 35.69, 44.27, 1.08]}
    
    # Create the labels for the bins
    bin_labels = [1, 2, 3, 4]
    
    # Create the dataframe object from the dict
    # (DataFrame.from_items was removed in pandas 1.0)
    df = pd.DataFrame(data)
    
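    # Define the bin edges; with right=False each bin includes its left edge
    # and excludes its right edge, e.g. [0.42, 14.16) and [14.16, 27.89)
    bins = [0.42, 14.16, 27.89, 41.62, 55.37]
    
    # Create the "bins" column using the cut function with the bins and labels
    df['bins'] = pd.cut(df['nums'], bins=bins, labels=bin_labels, right=False)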
    

    This creates a dataframe which has the following structure:

    print(df)
    
         nums bins
    0    3.05    1
    1   35.97    3
    2   49.11    4
    3   48.80    4
    4   48.02    4
    5   10.61    1
    6   25.69    2
    7    6.02    1
    8   55.36    4
    9    0.42    1
    10  47.87    4
    11   2.26    1
    12  54.43    4
    13   8.85    1
    14   8.75    1
    15  14.29    2
    16  41.29    3
    17  35.69    3
    18  44.27    4
    19   1.08    1
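
If the edges do not need to be hand-picked, one possible alternative is to let pd.cut compute them. The following is a minimal sketch, assuming four equal-width bins between the min and max are what is wanted; the names nums, groups and edges are only illustrative. Passing an integer to bins asks pandas for that many equal-width intervals, and retbins=True also returns the edges it computed. With the default right-closed intervals, pandas lowers the first edge slightly (by 0.1% of the range) so that the minimum value is included.

```python
import pandas as pd

# Same values as in the question, held in a Series (name is illustrative)
nums = pd.Series(
    [3.05, 35.97, 49.11, 48.80, 48.02, 10.61, 25.69, 6.02, 55.36, 0.42,
     47.87, 2.26, 54.43, 8.85, 8.75, 14.29, 41.29, 35.69, 44.27, 1.08],
    name="nums",
)

# bins=4 requests four equal-width intervals spanning min..max;
# retbins=True additionally returns the computed bin edges
groups, edges = pd.cut(nums, bins=4, labels=[1, 2, 3, 4], retbins=True)

print(groups.tolist())
# [1, 3, 4, 4, 4, 1, 2, 1, 4, 1, 4, 1, 4, 1, 1, 2, 3, 3, 4, 1]
print(edges)  # five edges: four interior-width steps from about min to max
```

Note that the computed edges (roughly 14.155, 27.89 and 41.625 for this data) are not identical to the rounded ones in the question, so a value lying between the two sets of edges could land in a different group. For this data the labels come out the same.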