I'm having trouble modifying a new column.
import pandas as pd
import numpy as np
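# dtype=object is needed on recent NumPy versions because each row mixes a list with scalars
da = np.array([[[1,2], 1,2,100],[[1,2], 1,3,100],[[1,2], 4,1,100], [[1,2], 5,6,101], [[1,2], 7,9,102], [[1,2], 8,7,102]], dtype=object)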
col = ['N', 'NRW', 'NRW_1', 'NRW_2']
ramka = pd.DataFrame(columns = col, data = da)
print (ramka)
ramka['TEST'] = ramka['NRW'].apply(lambda x: x+3)
print (ramka)
It's ok here. I get a new column with NRW+3. NRW unchanged.
ramka['TEST2'] = ramka['N'].apply(lambda x: x.append(555))
print (ramka)
Here I get a new column filled with None, and column N itself has been changed. Why?
If I make a copy of column N
ramka['TEST4'] = ramka['N']
and perform the same operation on a new column, the changes will be applied to both columns N and TEST4
ramka['TEST4'].apply(lambda x: x.append(666))
print (ramka)
I don't understand. Please help.
list.append works in place (and thus modifies the lists in N) and returns None. You should use:
ramka['TEST2'] = ramka['N'].apply(lambda x: x+[555])
ramka['TEST4'] = ramka['N'].apply(lambda x: x+[666])
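The difference can be seen without pandas at all. The following is a minimal plain-Python sketch; the variable names are made up for illustration and are not part of the question's code.

```python
# list.append mutates the list it is called on and returns None.
original = [1, 2]
result = original.append(555)
print(result)    # None
print(original)  # [1, 2, 555]

# The + operator builds a new list and leaves its operands untouched.
source = [1, 2]
extended = source + [555]
print(source)    # [1, 2]
print(extended)  # [1, 2, 555]
```

This is why the apply call with append fills the new column with None while silently changing N, whereas x+[555] returns a fresh list per row.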
Similarly, running ramka['TEST4'] = ramka['N'] doesn't make a copy of the lists; it just references them a second time. To really make a copy you would need:
ramka['TEST4'] = ramka['N'].apply(lambda x: x.copy())
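As a rough check of the aliasing described above, one can compare object identity with the is operator. This sketch uses a small hypothetical frame (df, ALIAS and COPY are illustrative names, not from the question) and assumes the cells hold plain Python lists.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration.
df = pd.DataFrame({'N': [[1, 2], [3, 4]]})

df['ALIAS'] = df['N']                           # same list objects, referenced twice
df['COPY'] = df['N'].apply(lambda x: x.copy())  # a new list object per row

# Identity checks: ALIAS shares its lists with N, COPY does not.
shared = df.loc[0, 'ALIAS'] is df.loc[0, 'N']
independent = df.loc[0, 'COPY'] is not df.loc[0, 'N']
print(shared, independent)

# Mutating the list stored in N in place shows up in ALIAS but not in COPY.
df.loc[0, 'N'].append(99)
print(df)
```

Note that x.copy() is a shallow copy. If the lists themselves contained mutable objects, copy.deepcopy from the standard library would be the safer choice.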
Note that to add a scalar to a numeric column you should not use apply but a vectorized operation:
ramka['TEST'] = ramka['NRW']+3
Output:
        N  NRW  NRW_1  NRW_2  TEST        TEST2        TEST4
0  [1, 2]    1      2    100     4  [1, 2, 555]  [1, 2, 666]
1  [1, 2]    1      3    100     4  [1, 2, 555]  [1, 2, 666]
2  [1, 2]    4      1    100     7  [1, 2, 555]  [1, 2, 666]
3  [1, 2]    5      6    101     8  [1, 2, 555]  [1, 2, 666]
4  [1, 2]    7      9    102    10  [1, 2, 555]  [1, 2, 666]
5  [1, 2]    8      7    102    11  [1, 2, 555]  [1, 2, 666]