
How to do a log transformation on more than one attribute (column) - Python


I have a dataset with 2 columns that are on completely different scales.

I need to do a log transformation on both columns to be able to do some visualization on them.

I cannot find Python code that lets me apply the log transformation to several columns at once.

Can anybody help me?

I have a dataset with qualitative and quantitative columns, and I wish to take the log of the RealizedPL and Volume columns.

My dataset looks a bit like this:

     Date           Name       Country     Product     RealizedPL     Volume
0    2019.01.01     Charles    Country1    ProductA      100           10200
1    2019.02.20     Pierre     Country2    ProductB      150           20500
2    2019.03.02     Chiara     Country1    ProductA      200           15300

How can I do the log transformation and keep the other columns as well, either by creating new columns for the logs or by replacing the original columns directly?

Thank you


Solution

  • You may wish to try:

    import numpy as np

    # Overwrite both columns with their natural logarithms
    df[["RealizedPL","Volume"]] = df[["RealizedPL","Volume"]].apply(np.log)
    print(df)
             Date     Name   Country   Product  RealizedPL    Volume
    0  2019.01.01  Charles  Country1  ProductA    4.605170  9.230143
    1  2019.02.20   Pierre  Country2  ProductB    5.010635  9.928180
    2  2019.03.02   Chiara  Country1  ProductA    5.298317  9.635608
    

    or:

    df[["RealizedPL_log", "Volume_log"]] = df[["RealizedPL","Volume"]].apply(np.log)
    

    to have logs as separate columns.

    Also note that if this is purely for visualization, you can leave the data untouched and plot on log-scaled axes instead, e.g. df.plot.scatter(..., logx=True, logy=True).
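
For anyone who wants to run the separate-columns approach end to end, here is a minimal sketch. It rebuilds the three sample rows from the question and adds the log columns through a small helper. The helper name add_log_columns and its suffix parameter are made up for illustration and are not part of pandas. It assumes every value is strictly positive, since np.log returns -inf for zeros and NaN for negatives.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the three rows shown in the question.
df = pd.DataFrame({
    "Date": ["2019.01.01", "2019.02.20", "2019.03.02"],
    "Name": ["Charles", "Pierre", "Chiara"],
    "Country": ["Country1", "Country2", "Country1"],
    "Product": ["ProductA", "ProductB", "ProductA"],
    "RealizedPL": [100, 150, 200],
    "Volume": [10200, 20500, 15300],
})


def add_log_columns(frame, columns, suffix="_log"):
    """Return a copy of `frame` with one natural-log column per name in `columns`.

    Hypothetical helper for illustration only; it is not a pandas function.
    """
    out = frame.copy()
    for col in columns:
        # Assumes positive values: np.log gives -inf for 0 and NaN for negatives.
        out[col + suffix] = np.log(out[col])
    return out


# The original frame is left unchanged; the logs live in new columns.
df_logged = add_log_columns(df, ["RealizedPL", "Volume"])
print(df_logged[["RealizedPL", "RealizedPL_log", "Volume", "Volume_log"]])
```

Working on a copy keeps the raw values available, which helps if you later want to compare the plain and log-scaled plots side by side.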