Clarification/Musing on Python Lambda function in Pandas


I'm finally breaking free from the shackles of SPSS and am reveling in the freedom of pandas and Python (love it). However, I'm trying to get a clearer picture of how Python lambda functions work with pandas, since they seem to pop up a lot. Here is an example that I hope will clear up the murkiness.

After creating a new dataframe from a string split:

 bs = fh['basis'].str.split(',', expand=True)

I want to rename all the variables by adding a "b" to the numeric headers. This works:

 n = list(bs)
 for x in n:
     bs.rename(columns={x: 'b' + str(x)}, inplace=True)

But I have a sneaking suspicion a lambda function would be better. However, this doesn't work:

 bs.rename(columns=lambda x: x = 'b' + str(x), inplace=True)

I thought lambda acted as a function, so if I pass in a column header I can append a 'b' to it. But the "=" throws an error. Any quick observations would be much appreciated. Cheers!


Solution

  • I'd use add_prefix():

    In [5]: bs = pd.DataFrame(np.random.rand(3,5))
    
    In [6]: bs
    Out[6]:
              0         1         2         3         4
    0  0.521593  0.088293  0.623103  0.099417  0.983149
    1  0.009741  0.465654  0.414261  0.024086  0.039543
    2  0.476219  0.918162  0.900815  0.126549  0.112388
    
    In [7]: bs.add_prefix('b')
    Out[7]:
             b0        b1        b2        b3        b4
    0  0.521593  0.088293  0.623103  0.099417  0.983149
    1  0.009741  0.465654  0.414261  0.024086  0.039543
    2  0.476219  0.918162  0.900815  0.126549  0.112388
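
On the lambda itself: the body of a lambda has to be a single expression, and its value is what gets returned, so an assignment like `x = ...` inside it is a syntax error. Dropping the `x =` should be enough. Here is a minimal sketch, assuming the same kind of integer-labelled frame as above (the `renamed` and `prefixed` names are just for illustration):

```python
import numpy as np
import pandas as pd

# Stand-in for the split result: columns are labelled 0..4
bs = pd.DataFrame(np.random.rand(3, 5))

# The lambda receives each column label and returns the new one.
# No assignment inside the lambda; the expression's value is the result.
renamed = bs.rename(columns=lambda x: 'b' + str(x))

# add_prefix should produce the same labels
prefixed = bs.add_prefix('b')

print(list(renamed.columns))
print(list(prefixed.columns))
```

Note that both calls return a new DataFrame and leave `bs` untouched, so assign the result back (`bs = bs.add_prefix('b')`) if you want to keep it.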