i have a df with MachineType, Prod/RT, and several other columns. MachineType contains either TRUE or FALSE. need to .fillna and .replace but in different ways for MachineType. (filling values are different for TRUE and FALSE)
Dataframe : updatedDf
my code do above calc:
updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==True]['Prod/RT'].replace(np.inf,0.021660)
updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==True]['Prod/RT'].fillna(0.021660)
updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==False]['Prod/RT'].replace(np.inf,0.050261)
updatedDf['Prod/RT']=updatedDf[updatedDf['MachineType']==False]['Prod/RT'].fillna(0.050261)
But my code gives an unexpected output with Nan values. Is there any way to fix this error?or can't we .fillna and .replace like above way?
My approach to your problem is to wrap the filling and replacing in a function and use it as parameter in pandas .apply()
. Using you approach will require the usage of .loc[]
.
updatedDf = pd.DataFrame({
'MachineType' : np.random.choice([True, False], 10, True),
'Prod/RT' : np.random.choice([np.nan, np.inf, random.random()], 10, True)
})
# solution 1
prod_RT_dict = {True:0.21660, False:0.050261}
def fillProd_RT(row):
if row['Prod/RT'] != np.inf and pd.notna(row['Prod/RT']):
return row['Prod/RT']
else:
return prod_RT_dict[row['MachineType']]
updatedDf['Prod/RT_2'] = updatedDf.apply(fillProd_RT, axis=1)
# solution 2
updatedDf['Prod/RT_3']=updatedDf['Prod/RT'].replace(np.inf,np.nan)
updatedDf.loc[updatedDf['MachineType']==True,'Prod/RT_3']=updatedDf\
.loc[updatedDf['MachineType']==True,'Prod/RT_3'].fillna(0.021660)
updatedDf.loc[updatedDf['MachineType']==False,'Prod/RT_3']=updatedDf\
.loc[updatedDf['MachineType']==False,'Prod/RT_3'].fillna(0.050261)
updatedDf