I have a function that needs to return multiple values:
def max_dd(ser):
    ...
    # compute i, j, dd
    return i, j, dd
If I call this function like this, passing in a series:
date1, date2, dd = df.rolling(window).apply(max_dd)
However, I get an error:
pandas.core.base.DataError: No numeric types to aggregate
If I return a single value from max_dd, everything is fine. How do I return multiple values from a function that is passed to "apply"?
Rolling apply can only produce a single numeric value per window. It does not support multiple return values, or even non-numeric ones (something as simple as a string). Any answer to this question will be a workaround.
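As a small illustration of the single-value case that does work, here is a minimal sketch. The series `s` and the "spread" calculation are hypothetical and not part of the original question.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
s = pd.Series([1.0, 4.0, 2.0, 8.0])

# The function returns one float per window, which rolling apply accepts.
# raw=True passes each window as a NumPy array instead of a Series.
spread = s.rolling(3).apply(lambda w: w.max() - w.min(), raw=True)
```

The first two entries of `spread` are NaN because those windows hold fewer than three values.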
That said, a viable workaround is to take advantage of the fact that rolling objects are iterable as of pandas 1.1.0 (see "What’s new in 1.1.0 (July 28, 2020)" in the pandas release notes).
This means you can still benefit from the faster grouping and indexing operations of rolling, while getting more flexible behaviour in plain Python:
import pandas as pd

def some_fn(df_):
    """
    When iterating over a rolling window, the min_periods argument of
    rolling is disregarded, and a DataFrame is produced for every window.
    The input is also of type DataFrame, not Series.
    You are completely responsible for doing all operations here,
    including ignoring values if the input is not of the correct shape
    or format.
    :param df_: A DataFrame produced by rolling
    :return: the values of column 'a' joined with commas, and the max
        value within the window
    """
    return ','.join(df_['a']), df_['a'].max()

window = 5
# One tuple per window; each tuple becomes a row of the result
results = pd.DataFrame([some_fn(df_) for df_ in df.rolling(window)])
Sample DataFrame and output:
df = pd.DataFrame({'a': list('abdesfkm')})
df:
   a
0  a
1  b
2  d
3  e
4  s
5  f
6  k
7  m
results:
           0  1
0          a  a
1        a,b  b
2      a,b,d  d
3    a,b,d,e  e
4  a,b,d,e,s  s
5  b,d,e,s,f  s
6  d,e,s,f,k  s
7  e,s,f,k,m  s
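The same iteration pattern could be applied to the original question. The following is only a sketch: the question does not show the body of max_dd, so the drawdown definition here (largest drop from a running maximum to a later value) is an assumption, and the `prices` data is made up. Because iteration also yields the shorter windows at the start, they are filtered out by length.

```python
import pandas as pd


def max_dd(ser):
    """Return (peak label, trough label, drawdown) for one window.

    Assumed definition: the largest drop from a running maximum
    to a later value inside the window.
    """
    running_max = ser.cummax()
    drawdowns = running_max - ser
    trough = drawdowns.idxmax()            # label where the drop is largest
    peak = ser.loc[:trough].idxmax()       # highest value up to the trough
    return peak, trough, drawdowns.loc[trough]


# Hypothetical sample data, for illustration only.
prices = pd.Series(
    [100.0, 120.0, 90.0, 110.0, 105.0, 130.0],
    index=pd.date_range("2020-01-01", periods=6),
)
window = 3

# Iterating a Series rolling object yields one Series per window,
# including the incomplete leading windows, so keep only full ones.
rows = [max_dd(w) for w in prices.rolling(window) if len(w) == window]

result = pd.DataFrame(
    rows,
    columns=["date1", "date2", "dd"],
    index=prices.index[window - 1:],       # label each row by window end
)
```

Each row of `result` holds the three values for the window ending at that row's index label, so `result["date1"]`, `result["date2"]` and `result["dd"]` take the place of the three unpacked names in the question.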