I'm finally breaking free from the shackles of SPSS and am reveling in the freedom of pandas and Python (love it). However, I'm trying to get a clearer picture of how Python's lambda functions are used with pandas, since they seem to pop up a lot. Here is an example that I hope will clear up the murkiness.
After creating a new dataframe from a string split:
bs = fh['basis'].str.split(',', expand=True)
I want to rename all the variables by adding a "b" to the numeric headers. This works:
n = list(bs)
for x in n:
    bs.rename(columns={x: 'b' + str(x)}, inplace=True)
But I have a sneaking suspicion a lambda function would be better. However, this doesn't work:
bs.rename(columns=lambda x: x = 'b' + str(x), inplace=True)
I thought lambda acted as a function, so if I pass in a column header I can append a 'b' to it. But the "=" throws an error. Any quick observations would be much appreciated. Cheers!
I'd use add_prefix():
In [5]: bs = pd.DataFrame(np.random.rand(3,5))
In [6]: bs
Out[6]:
          0         1         2         3         4
0  0.521593  0.088293  0.623103  0.099417  0.983149
1  0.009741  0.465654  0.414261  0.024086  0.039543
2  0.476219  0.918162  0.900815  0.126549  0.112388
In [7]: bs.add_prefix('b')
Out[7]:
         b0        b1        b2        b3        b4
0  0.521593  0.088293  0.623103  0.099417  0.983149
1  0.009741  0.465654  0.414261  0.024086  0.039543
2  0.476219  0.918162  0.900815  0.126549  0.112388
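As for the lambda attempt in the question: the body of a lambda has to be a single expression, and its value is what gets returned. An assignment such as x = ... is a statement, which is why the "=" raises a SyntaxError. Dropping the assignment should be enough. Here is a minimal sketch, assuming a stand-in frame of zeros with the same integer column labels that str.split(..., expand=True) produces (the names renamed and prefixed are only for illustration):

```python
import numpy as np
import pandas as pd

# Stand-in for the split result: integer column labels 0..4.
# The values are arbitrary; only the column labels matter here.
bs = pd.DataFrame(np.zeros((3, 5)))

# The lambda receives each column label and returns the new one.
# No "x =" is needed: the expression's value is the return value.
renamed = bs.rename(columns=lambda x: 'b' + str(x))

# add_prefix builds the same labels and also returns a new DataFrame.
prefixed = bs.add_prefix('b')

print(list(renamed.columns))
print(list(prefixed.columns))
```

Both calls leave bs itself unchanged, so assign the result back (bs = bs.add_prefix('b')) if you want to keep the new names.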