python pandas optimization floating-accuracy

Multiplying float columns in pandas takes too much time

Morning all, I have a DataFrame with 460,000 rows and 15 columns. I'm trying to assign to one column the product of two others. The code looks like this:

df[df.colx == 'S']['prd'] = df['col1']*df['col2']

prd, col1 and col2 all have the float64 data type. I have run a lot of operations on other columns with no problem, including date differences, and they execute almost instantly. If I try

df['prd'] =  df['col1']*df['col2']

the execution is super fast. The problem arises when I try to apply the operation to a subset of the DataFrame. Can someone help me and explain how I can lower the execution time? Thank you very much!

UPDATE: if I do

df2 = pd.DataFrame(df[df.colx=='S'])

and then

df2['prd'] =  df['col1']*df['col2']

it is still super slow. How is that possible? df2 should be a new DataFrame.

Solution

Try to separate the operations:

# .copy() makes df2 an independent frame, so the assignment below is safe
df2 = df[df.colx == 'S'].copy()
df2['prd'] = df2['col1']*df2['col2']
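
As a minimal sketch of the two steps above on a toy frame (the values are invented for illustration; only the column names come from the question), it could look like this:

```python
import pandas as pd

# Toy data, invented for illustration; column names follow the question.
df = pd.DataFrame({
    "colx": ["S", "N", "S", "N"],
    "col1": [1.5, 2.0, 3.0, 4.0],
    "col2": [2.0, 3.0, 4.0, 5.0],
})

# Step 1: take the subset. .copy() makes df2 an independent frame,
# so writing to it does not trigger SettingWithCopyWarning.
df2 = df[df.colx == "S"].copy()

# Step 2: multiply columns of df2 itself, so both operands
# already share df2's (filtered) index.
df2["prd"] = df2["col1"] * df2["col2"]

print(df2)
```

Note that both factors are taken from df2 rather than from df, so only the filtered rows take part in the multiplication.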

Or, if df.colx == 'S' is a condition that decides which value each row gets, you can run:

import numpy
df['prd'] = numpy.where(df['colx'] == 'S', df['col1']*df['col2'], 'Do something else')

Just replace 'Do something else' with the value or expression that should be used for the rows where df.colx != 'S'.
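
For a concrete version of the numpy.where idea, here is a small sketch on toy data. The NaN fallback is my assumption (the question does not say what the other rows should contain), and prd_loc is a hypothetical column name used only to show an alternative based on a boolean mask with .loc, which is a different technique from the one above:

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration; column names follow the question.
df = pd.DataFrame({
    "colx": ["S", "N", "S", "N"],
    "col1": [1.5, 2.0, 3.0, 4.0],
    "col2": [2.0, 3.0, 4.0, 5.0],
})

# Option 1: numpy.where with a numeric fallback (NaN is an assumption),
# so that 'prd' stays a float64 column.
df["prd"] = np.where(df["colx"] == "S", df["col1"] * df["col2"], np.nan)

# Option 2: boolean mask with .loc, assigning only to the matching rows
# in a single indexing step on the original frame.
mask = df["colx"] == "S"
df["prd_loc"] = np.nan  # hypothetical column, pre-filled with NaN
df.loc[mask, "prd_loc"] = df.loc[mask, "col1"] * df.loc[mask, "col2"]

print(df)
```

Using a numeric fallback rather than a string placeholder matters if prd is meant to remain float64, as in the question.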