Python: Scale columns in a pandas DataFrame


I have a DataFrame with several columns. Each column needs to be scaled by its own factor, and I would like to know whether there is a one-liner that scales the columns appropriately given a dictionary or something similar.

E.g.:

scalingDictionary = {'a': 10, 'b': 5, 'c': 0.1}
df = pd.DataFrame({'a': [2, 4, 6, 8], 'b': [3, 6, 9, 12], 'c': [1, 2, 3, 4]})

A one-liner that multiplies each column by the corresponding value from the dictionary should give the desired output:

a    b    c
20   15   0.1
40   30   0.2
60   45   0.3
80   60   0.4

Solution

  • Multiply the DataFrame by the dictionary; this works well if the dictionary keys are the same as the column names:

    # Each column is multiplied by the factor stored under its name
    df = df.mul(scalingDictionary)
    print(df)
          a     b    c
    0  20.0  15.0  0.1
    1  40.0  30.0  0.2
    2  60.0  45.0  0.3
    3  80.0  60.0  0.4       
    

    If some columns have no matching key in the dictionary, reindex the factors by the columns and fill the gaps with 1:

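    import pandas as pd

    scalingDictionary = {'a': 10, 'b': 5}  # no factor for column 'c'

    df = pd.DataFrame({'a':[2,4,6,8], 'b':[3,6,9,12], 'c':[1,2,3,4]})

    # Align the factors to df.columns; columns without a factor get 1
    df = df.mul(pd.Series(scalingDictionary).reindex(df.columns, fill_value=1))
    print(df)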
        a   b  c
    0  20  15  1
    1  40  30  2
    2  60  45  3
    3  80  60  4
    

    Or build a dictionary that defaults every column to 1 and then overrides it with the scaling factors:

    # Starting again from the original df, not the already scaled one
    df = df.mul({**dict.fromkeys(df.columns, 1), **scalingDictionary})
    print(df)
        a   b  c
    0  20  15  1
    1  40  30  2
    2  60  45  3
    3  80  60  4
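
The reindex approach above could be wrapped in a small reusable function. The following is only a sketch: `scale_columns` and its `default` parameter are hypothetical names for illustration, not part of pandas. It also shows why the alignment step matters, since multiplying by a Series that lacks a column label leaves that column as NaN.

```python
import pandas as pd


def scale_columns(df, factors, default=1):
    """Multiply each column by its factor; columns without one use `default`.

    `scale_columns` is a hypothetical helper name, not a pandas function.
    """
    # Align the factors to the frame's columns, filling gaps with `default`.
    aligned = pd.Series(factors).reindex(df.columns, fill_value=default)
    # axis="columns" matches the Series index against the column labels.
    return df.mul(aligned, axis="columns")


df = pd.DataFrame({'a': [2, 4, 6, 8], 'b': [3, 6, 9, 12], 'c': [1, 2, 3, 4]})

# Column 'c' has no factor, so it is multiplied by the default of 1.
scaled = scale_columns(df, {'a': 10, 'b': 5})

# Without the reindex step, the column missing from the factors becomes NaN.
unaligned = df.mul(pd.Series({'a': 10, 'b': 5}))
```

Because the function returns a new DataFrame instead of reassigning `df`, the original data stays available for comparison or for applying a different set of factors.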