Tags: pandas, numpy, eval, valueerror

pandas df.eval gives ValueError: data type must provide an itemsize


Here is a short, simple snippet that reproduces the issue I'm having:

import pandas as pd

df = pd.DataFrame(dict(a=[1,2,3]))
df=df.eval('x=2')  # this one is ok
df.eval('y="num"')  # here it will fail

The error that I get is:

ValueError: data type must provide an itemsize

What is the issue? How can I make it work?
It wasn't like this in older pandas versions...


I know that I can replace it with:

df['y']="num"
# or
df.assign(y='num')

But this is not the answer that I need...

I also tried replacing "num" with:

np.str_("num")

This does have an .itemsize attribute, but it didn't help...

Note that df.query, with a different expression, gives me the same error, and that is the problem I'm actually trying to solve. I'm just assuming it's the same issue.


Solution

  • Pass the engine='python' parameter:

    print(df.eval('y="num"', engine='python'))
       a  x    y
    0  1  2  num
    1  2  2  num
    2  3  2  num
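
If you'd rather not hard-code the engine everywhere, one option is a small wrapper that tries the default engine first and only retries with engine='python' when the ValueError shows up. This is just a sketch: eval_with_fallback is a hypothetical helper name, not part of pandas, and it assumes the failure comes from the default engine (numexpr, when installed) not handling the string constant. Since df.query passes extra keyword arguments on to eval, the same engine='python' argument may help with the query case from the question too, though that depends on what the actual query expression is.

```python
import pandas as pd


def eval_with_fallback(df, expr):
    """Hypothetical helper: try the default engine, then retry with engine='python'."""
    try:
        return df.eval(expr)
    except ValueError:
        # Assumption: the default engine rejected the expression (e.g. a string
        # constant), so retry with the pure-Python engine.
        return df.eval(expr, engine="python")


df = pd.DataFrame(dict(a=[1, 2, 3]))
df = eval_with_fallback(df, "x = 2")
df = eval_with_fallback(df, 'y = "num"')

# query() forwards extra keyword arguments to eval(), so the engine can be set here too.
matches = df.query('y == "num"', engine="python")
```

The python engine can be slower than numexpr on large frames, so falling back only on failure keeps the fast path for purely numeric expressions.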