I have a dataframe with pyarrow dtypes such as `duration[ns][pyarrow]`.
Using good old numpy dtypes, I can get the seconds using
foo['DURATION_NEW'].dt.total_seconds()
but the pyarrow equivalent raises an AttributeError:
Can only use .dt accessor with datetimelike values
Sadly, the usually helpful pandas documentation is rather short regarding pyarrow-dtype differences. https://pandas.pydata.org/docs/user_guide/pyarrow.html
Furthermore, I couldn't find an official (or otherwise helpful) guide for migrating from the numpy backend to the pyarrow backend that covers this case. I am using pandas 2.1.3.
The error can be reproduced with the example below:
import pandas as pd
import pyarrow as pa

df = pd.DataFrame(
    {"DURATION_NEW": [pd.Timedelta(minutes=60), pd.Timedelta(seconds=1000)]},
    dtype=pd.ArrowDtype(pa.duration("ns")),
)
df.dtypes
# DURATION_NEW duration[ns][pyarrow]
df["DURATION_NEW"].dt.total_seconds()
# AttributeError: Can only use .dt accessor with datetimelike values
Unfortunately, there is an open issue for this (see apache/arrow#33962): pyarrow can't compute functions for timedeltas yet (see pandas-dev/pandas#52284).
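As a workaround, you can try using apply with Timedelta.seconds. Be aware that .seconds is only the seconds component of the timedelta (days are excluded), so it equals the total only for durations shorter than one day; use td.total_seconds() in the lambda if yours can be longer: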
df["DURATION_NEW"].apply(lambda td: td.seconds)
# 0 3600
# 1 1000
# Name: DURATION_NEW, dtype: int64
Or with a list comprehension:
[td.seconds for td in df["DURATION_NEW"]]
# [3600, 1000]