python pandas dataframe for-loop logarithm

Python: How can I add columns to a dataframe containing the log of some other columns using a for loop?


I have a dataframe with three columns containing random numbers. I want to use a for loop to add three more columns, each containing the log values of the corresponding original column.

My dataframe is given by:

    K           L           Y
0   44.482983   22.612093   19.160614
1   44.131591   21.071627   44.804061
2   46.188112   21.420053   10.296304
3   38.231555   23.777519   19.128269
4   40.289477   32.450482   23.141743
...     ...     ...     ...
99995   48.793839   33.907988   35.769701
99996   41.654043   34.899131   14.866854
99997   49.602684   20.047823   11.387398
99998   47.265013   30.397463   36.708146
99999   49.375947   39.978109   45.814494

100000 rows × 3 columns

Using the following lines gives me the desired result:

data['k'] = np.log(data['K'])
data['l'] = np.log(data['L'])
data['y'] = np.log(data['Y'])

The resulting dataframe looks like this:

    K           L           Y           k           l           y
0   44.482983   22.612093   19.160614   3.795107    3.118485    2.952857
1   44.131591   21.071627   44.804061   3.787176    3.047927    3.802299
2   46.188112   21.420053   10.296304   3.832722    3.064328    2.331785
3   38.231555   23.777519   19.128269   3.643661    3.168741    2.951167
4   40.289477   32.450482   23.141743   3.696090    3.479715    3.141638
...     ...     ...     ...     ...     ...     ...
99995   48.793839   33.907988   35.769701   3.887604    3.523651    3.577101
99996   41.654043   34.899131   14.866854   3.729398    3.552462    2.699134
99997   49.602684   20.047823   11.387398   3.904045    2.998121    2.432507
99998   47.265013   30.397463   36.708146   3.855770    3.414359    3.602999
99999   49.375947   39.978109   45.814494   3.899463    3.688332    3.824600

100000 rows × 6 columns

What I tried was ...

for i in ['k', 'l', 'y']:
    for j in ['K', 'L', 'Y']:
        data[i] = np.log(data[j])

... but this only adds three columns containing the log of 'K'.

Where is my mistake within the for loop?


Solution

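  • Your mistake is the nested loops: the two lists have to be iterated together, in a single loop. With nested loops, the inner loop overwrites data[i] on every pass, so each new column ends up holding the log of whichever source column the inner loop visits last.

    The zip function is useful here, because it pairs each new column name with its source column. The following code should work:

    for i, j in zip(['k', 'l', 'y'], ['K', 'L', 'Y']):
        data[i] = np.log(data[j])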
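
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The small random dataframe is only a hypothetical stand-in for the one in the question (the seed, value range, and row count are assumptions), and the names `new`, `old`, `logs`, and `alt` are illustrative. It also shows one possible loop-free variant, which is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: three columns of positive random numbers.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.uniform(10, 50, size=(5, 3)), columns=['K', 'L', 'Y'])

# Pair each new column name with its source column and fill them one at a time.
for new, old in zip(['k', 'l', 'y'], ['K', 'L', 'Y']):
    data[new] = np.log(data[old])

# Loop-free alternative: take the log of all three columns at once,
# lower-case the resulting column names, and join them onto the originals.
logs = np.log(data[['K', 'L', 'Y']]).rename(columns=str.lower)
alt = data[['K', 'L', 'Y']].join(logs)

print(data)
```

Both `data` and `alt` should end up with the six columns K, L, Y, k, l, y. The loop version is closer to what the question asked for; the vectorized version may be easier to extend when there are many columns.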