This is my DataFrame:
import pandas as pd
df = pd.DataFrame(
{
'a': ['long', 'long', 'short', 'long', 'short', 'short', 'short'],
'b': [1, -1, 1, 1, -1, -1, 1],
}
)
The expected output adds a column a_1:
       a  b    a_1
0   long  1   long
1   long -1   long
2  short  1  short
3   long  1   long
4  short -1   long
5  short -1   long
6  short  1  short
Logic: a_1 should be created like this:
df.loc[df.b.eq(-1), 'a_1'] = 'long'
df['a_1'] = df.a_1.fillna(df.a)
This problem is really weird: the fillna step does not work. With pandas 1.2.4 the code gave the expected result, but with 2.1.4 it does not.
2.1.4 is currently the default version on Colab, which is where I ran this code.
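One way to see why fillna might have no effect is to inspect what the new column holds right after the conditional assignment. The sketch below is only a diagnostic: it counts real missing values and literal 'nan' strings. The names n_missing and n_nan_strings are introduced here for illustration, and which of the two counts is non-zero is assumed to depend on the installed pandas version.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': ['long', 'long', 'short', 'long', 'short', 'short', 'short'],
        'b': [1, -1, 1, 1, -1, -1, 1],
    }
)

# Same first step as in the question: create a_1 only where b == -1.
df.loc[df['b'].eq(-1), 'a_1'] = 'long'

# Rows that were not assigned are either real missing values (fillna can fill them)
# or the literal string 'nan' (fillna ignores them).
n_missing = int(df['a_1'].isna().sum())
n_nan_strings = int(df['a_1'].eq('nan').sum())

print(pd.__version__, n_missing, n_nan_strings)
```

If n_nan_strings is non-zero, the unassigned rows are ordinary strings rather than missing values, so fillna has nothing to replace.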
This appears to be caused by pandas 2.1.4 filling the unassigned rows with the string 'nan' instead of a real NaN when a new string column is created from only some of the rows, so fillna finds nothing to fill. Whatever the cause, repeatedly updating values that match a condition is not the recommended approach in pandas. The mask function is designed for exactly this situation, so use it:
df['a_1'] = df['a'].mask(df['b'].eq(-1), 'long')
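As a self-contained check, the sketch below applies mask to the DataFrame from the question and compares the result with the expected a_1 column. The names expected, matches_expected and alt are introduced here for illustration only. It also shows Series.where with the inverted condition, which should give the same result.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': ['long', 'long', 'short', 'long', 'short', 'short', 'short'],
        'b': [1, -1, 1, 1, -1, -1, 1],
    }
)

# Where b == -1, replace the value taken from 'a' with 'long'; keep 'a' elsewhere.
df['a_1'] = df['a'].mask(df['b'].eq(-1), 'long')

# Equivalent formulation: keep 'a' where b != -1, otherwise use 'long'.
alt = df['a'].where(df['b'].ne(-1), 'long')

# Expected column copied from the question.
expected = ['long', 'long', 'short', 'long', 'long', 'long', 'short']
matches_expected = df['a_1'].tolist() == expected

print(df)
print(matches_expected)
```

Because mask builds the whole column in one step, no intermediate column with missing values is ever created, so the fillna step is not needed.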