python pandas numpy matplotlib perfplot

Perfplot bench() raises "TypeError: ufunc 'isfinite' not supported for the input types"


I am using the perfplot library to test how a DatetimeIndex affects lookups in a pandas DataFrame.

I have defined a setup function that creates two DataFrames: one with a datetime index and one with the time stored as a column. I have also defined two functions that use .loc on the index and on the column respectively, and each returns the selected subset. However, running the benchmark raises a TypeError:

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Code:

import numpy as np
import pandas as pd
from datetime import datetime
import perfplot


def setup_code(n):
    timeline = pd.date_range(end=datetime.now(), freq='1s', periods=n)
    sensor_readings = np.random.randint(100, size=(n, 4))
    col_labels = ['Sensor1', 'Sensor2', 'Sensor3', 'Sensor4']
    data = pd.DataFrame(sensor_readings, columns=col_labels)
    data['time'] = timeline
    data['time'] = pd.to_datetime(data['time'])
    data2 = data.copy()
    data2 = data2.set_index('time')
    print(n)
    return [data, data2]


def f1(ldata):
    data = ldata[0]
    subdata = data.loc[(data['time'] >= '2019-06-21 08:00:00') & (data['time'] <= '2019-06-21 11:00:00')]
    return subdata


def f2(ldata):
    data = ldata[1]
    subdata = data.loc['2019-06-21 04:00:00':'2019-06-21 10:00:00']
    return subdata


out = perfplot.bench(
    setup=setup_code,  
    kernels=[
        f1, f2
    ],
    n_range=[1000 ** k for k in range(1, 3)],
    labels=['Without Indexing', 'With Indexing'],
    xlabel='Length of DataFrame'
)
out.show()

Traceback:

Traceback (most recent call last):                                                                                                | 0/2 [00:00<?, ?it/s]
  File ".\scratchpad.py", line 39, in <module>
    xlabel='Length of DataFrame'
  File "C:\Users\hpandya\AppData\Local\Continuum\anaconda3\lib\site-packages\perfplot\main.py", line 128, in bench
    reference, kernel(data)
  File "C:\Users\hpandya\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\numeric.py", line 2423, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "C:\Users\hpandya\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\numeric.py", line 2521, in isclose
    xfin = isfinite(x)
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

It is odd that the traceback points to the line where I set xlabel. I feel like I am missing something trivial here.


Solution

  • By default, perfplot.bench() and perfplot.show() compare the outputs of all kernels to make sure every kernel produces the same result. The comparison uses NumPy functions (numpy.allclose in the traceback above), which do not work for every kind of kernel output. Here the kernels return DataFrames, and the one returned by f1 contains a datetime column, which is most likely why isfinite fails.

    What you want to do is pass an equality_check argument, which gives you control over how the outputs are compared. This is especially useful when the kernels return things such as iterables of strings or dictionaries, which NumPy cannot compare well.

    Set equality_check to None if you are confident your functions are correct; otherwise pass a callable that implements your own checking logic.

    out = perfplot.bench(
        ...
        equality_check=lambda x, y: x.equals(y)  # or equality_check=None to skip the check
    )
    

    See this answer (scroll to the bottom) for more examples of how equality_check has been used for timing different functions.
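As a rough illustration of the point above, the sketch below rebuilds two tiny frames shaped like the ones from setup_code and compares them without perfplot. The helper names numpy_check_raises and same_rows are hypothetical and not part of perfplot or pandas. The sketch also assumes that perfplot passes the first kernel's output as the first argument of equality_check, which is what the traceback suggests.

```python
import numpy as np
import pandas as pd

# Small stand-in for the two frames returned by setup_code (5 rows, fixed dates).
timeline = pd.date_range("2019-06-21 08:00:00", periods=5, freq="1s")
with_column = pd.DataFrame({"Sensor1": np.arange(5), "time": timeline})
with_index = with_column.set_index("time")


def numpy_check_raises(a, b):
    """Return True if np.allclose cannot compare the two kernel outputs."""
    try:
        np.allclose(a, b)
    except TypeError:
        return True
    return False


def same_rows(col_result, idx_result):
    """Hypothetical equality_check for f1-style and f2-style outputs.

    Moves the DatetimeIndex back into a 'time' column and aligns the column
    order, so that only the selected rows are compared.
    """
    left = col_result.reset_index(drop=True)
    right = idx_result.reset_index()[left.columns]
    return left.equals(right)


# Select the same window from both frames, the way f1 and f2 do.
start = "2019-06-21 08:00:02"
by_column = with_column.loc[with_column["time"] >= start]
by_index = with_index.loc[start:]

print(numpy_check_raises(by_column, by_index))  # does the NumPy check raise?
print(by_column.equals(by_index))               # plain DataFrame.equals
print(same_rows(by_column, by_index))           # comparison after aligning layout
```

Note that a plain x.equals(y) reports a mismatch for these two outputs, because one frame keeps time as a column and the other uses it as the index. In the question, f1 and f2 also select different time windows (08:00 to 11:00 versus 04:00 to 10:00), so a strict check would only pass once both kernels select the same rows. If the goal is only to time the lookups, equality_check=None avoids the comparison entirely.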