I was using pandas eval inside a where call, within a function, to create a column in a data frame. It worked in the past, but now it doesn't. Our Dataiku software recently moved to Python 3. Could that be the reason?
Below is the code currently in place:
import pandas as pd, numpy as np
from numpy import where, nan
d = {'ASSET': ['X','X','A','X','B'], 'PRODUCT': ['Z','Y','Z','C','Y']}
MAIN_df = pd.DataFrame(data=d)
def val_per(ASSET, PRODUCT):
    return(
        where(pd.eval("ASSET== 'X' & PRODUCT == 'Z'"),0.04,
        where(pd.eval("PRODUCT == 'Y'"),0.08,1.5)
        )
    )
MAIN_2_df = (MAIN_df.eval("PCT = @val_per(ASSET, PRODUCT)"))
The error I now receive is: <class 'TypeError'>: unhashable type: 'numpy.ndarray'
You can replace the final MAIN_df.eval(...) statement with a direct call to the function, assigning the result to the new column:
MAIN_2_df = MAIN_df.copy()
MAIN_2_df['PCT'] = val_per(MAIN_2_df.ASSET, MAIN_2_df.PRODUCT)
Calling the function directly should also be faster on large dataframes. A fully vectorized approach that builds the conditions on the columns themselves, without pd.eval, may be faster still.
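
Here is a minimal sketch of what such a vectorized version could look like, assuming the same MAIN_df as in the question. The conditions are built as boolean masks directly on the columns and passed to numpy.select, so neither pd.eval nor DataFrame.eval is involved. The lowercase parameter names and the use of np.select and DataFrame.assign are my own choices for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

d = {'ASSET': ['X', 'X', 'A', 'X', 'B'], 'PRODUCT': ['Z', 'Y', 'Z', 'C', 'Y']}
MAIN_df = pd.DataFrame(data=d)

def val_per(asset, product):
    # Conditions are evaluated in order; the first one that matches wins,
    # which mirrors the nested where(...) calls in the question.
    conditions = [
        (asset == 'X') & (product == 'Z'),
        product == 'Y',
    ]
    choices = [0.04, 0.08]
    # Rows matching none of the conditions get the default value.
    return np.select(conditions, choices, default=1.5)

# assign returns a new dataframe, so MAIN_df itself is left unchanged.
MAIN_2_df = MAIN_df.assign(PCT=val_per(MAIN_df['ASSET'], MAIN_df['PRODUCT']))
print(MAIN_2_df)
```

With the sample data, the PCT column comes out as 0.04, 0.08, 1.5, 1.5, 0.08. Note the parentheses around each comparison: with plain Python operators, & binds more tightly than ==, so they are required here.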