
df.apply(hurst_function) gives "TypeError: must be real number, not tuple" in Python

I have a DataFrame column that contains ratios of some numbers. I want to apply a Hurst exponent function to that column using the df.apply() method.

I don't know whether the error comes from df.apply or from hurst_function. Here is the code, which calculates the Hurst exponent over a rolling window of the column using the apply method:

import hurst 

def hurst_function(df_col_slice):
    return hurst.compute_Hc(df_col_slice)

def func(df_col):
    results = round(df_col.rolling(101).apply(hurst_function)[100:],1)
    return results


I get the error:

Input In [73], in func(df_col)
---> 32     results = round(df_col.rolling(101).apply(hurst_function)[100:],1)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in Rolling.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
   1822 @doc(
   1823     template_header,
   1824     create_section_header("Parameters"),
   1841     kwargs: dict[str, Any] | None = None,
   1842 ):
-> 1843     return super().apply(
   1844         func,
   1845         raw=raw,
   1846         engine=engine,
   1847         engine_kwargs=engine_kwargs,
   1848         args=args,
   1849         kwargs=kwargs,
   1850     )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in RollingAndExpandingMixin.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
   1312 else:
   1313     raise ValueError("engine must be either 'numba' or 'cython'")
-> 1315 return self._apply(
   1316     apply_func,
   1317     numba_cache_key=numba_cache_key,
   1318     numba_args=numba_args,
   1319 )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in BaseWindow._apply(self, func, name, numba_cache_key, numba_args, **kwargs)
    587     return result
    589 if self.method == "single":
--> 590     return self._apply_blockwise(homogeneous_func, name)
    591 else:
    592     return self._apply_tablewise(homogeneous_func, name)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in BaseWindow._apply_blockwise(self, homogeneous_func, name)
    437 """
    438 Apply the given function to the DataFrame broken down into homogeneous
    439 sub-frames.
    440 """
    441 if self._selected_obj.ndim == 1:
--> 442     return self._apply_series(homogeneous_func, name)
    444 obj = self._create_data(self._selected_obj)
    445 if name == "count":
    446     # GH 12541: Special case for count where we support date-like types

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in BaseWindow._apply_series(self, homogeneous_func, name)
    428 except (TypeError, NotImplementedError) as err:
    429     raise DataError("No numeric types to aggregate") from err
--> 431 result = homogeneous_func(values)
    432 return obj._constructor(result, index=obj.index,

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in BaseWindow._apply.<locals>.homogeneous_func(values)
    579     return func(x, start, end, min_periods, *numba_args)
    581 with np.errstate(all="ignore"):
--> 582     result = calc(values)
    584 if numba_cache_key is not None:
    585     NUMBA_FUNC_CACHE[numba_cache_key] = func

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in BaseWindow._apply.<locals>.homogeneous_func.<locals>.calc(x)
    571 start, end = window_indexer.get_window_bounds(
    572     num_values=len(x),
    573     min_periods=min_periods,
    575     closed=self.closed,
    576 )
    577 self._check_window_bounds(start, end, len(x))
--> 579 return func(x, start, end, min_periods, *numba_args)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\window\, in RollingAndExpandingMixin._generate_cython_apply_func.<locals>.apply_func(values, begin, end, min_periods, raw)
   1339 if not raw:
   1340     # GH 45912
   1341     values = Series(values, index=self._on)
-> 1342 return window_func(values, begin, end, min_periods)

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\_libs\window\aggregations.pyx:1315, in pandas._libs.window.aggregations.roll_apply()

TypeError: must be real number, not tuple

What can I do to solve this?

Edit: display(df_col_slice) is giving the following output:

0      0.282043
1      0.103355
2      0.537766
3      0.491976
4      0.535050
96     0.022696
97     0.438995
98    -0.131486
99     0.248250
100    1.246463
Length: 101, dtype: float64


  • hurst.compute_Hc returns a tuple of 3 values:

    H, c, vals = compute_Hc(df_col_slice)

    where H is the Hurst exponent and c is a constant.

    However, pandas._libs.window.aggregations.roll_apply() expects the function it is given to return a single scalar: the reduced result of one rolling window.

    That is why hurst_function needs to return one scalar taken from that tuple rather than the whole tuple. For a rolling Hurst exponent, that scalar is H.
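
A minimal sketch of a wrapper that returns a single scalar per window, assuming the Hurst exponent H is the value wanted. To keep it runnable without the hurst package, compute_hc_stand_in is a hypothetical placeholder that only mimics the 3-tuple shape of compute_Hc; its numbers are not real Hurst estimates. The window parameter and the random test series are also assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def compute_hc_stand_in(window):
    """Hypothetical stand-in for hurst.compute_Hc: returns a 3-tuple (H, c, vals)."""
    values = np.asarray(window, dtype=float)
    # Placeholder numbers only; this is NOT a real Hurst estimate.
    h = float(np.clip(values.std(), 0.0, 1.0))
    c = 1.0
    vals = [list(range(len(values))), values.tolist()]
    return h, c, vals


def hurst_function(df_col_slice):
    # Rolling.apply needs one scalar per window, so unpack the tuple
    # and return only H, discarding c and vals.
    h, _c, _vals = compute_hc_stand_in(df_col_slice)
    return h


def func(df_col, window=101):
    # The first (window - 1) results are NaN because those windows are incomplete,
    # so they are sliced off before rounding.
    rolled = df_col.rolling(window).apply(hurst_function)
    return rolled.iloc[window - 1:].round(1)


# Random data used purely to exercise the code path.
rng = np.random.default_rng(0)
df_col = pd.Series(rng.normal(size=150))
results = func(df_col)
print(results.head())
```

With the real package, the body of hurst_function would presumably become `return hurst.compute_Hc(df_col_slice)[0]`, with func left unchanged.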