python pandas numpy vectorization

Function with try except in np.vectorize returns error message


I have a vectorized function that makes a simple adjustment to a number:

import pandas as pd
import numpy as np

@np.vectorize
def adjust_number(number: int) -> int:

    max_number = 6
    default_substitute = 2

    # Try to convert to int, if not possible, use default_substitute
    try:
        number = int(number)
    except:
        number = default_substitute

    return min(number, max_number)

I apply the function to a DataFrame:

df = pd.DataFrame({'numbers': [1.0, 9.0, np.nan]})
df = df.assign(adjusted_number=lambda x: adjust_number(x['numbers']))

This returns the expected output, but I also get a strange warning:

c:\Users\xxx\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\lib\function_base.py:2412: RuntimeWarning: invalid value encountered in adjust_number (vectorized)
outputs = ufunc(*inputs)

It is not a huge issue, but it is very annoying. The warning seems to be triggered by the try-except: if I remove it (which I cannot really do without breaking the functionality), the warning goes away.

What is causing this, and how can I get rid of the warning?


Solution

  • If it is NaNs/infinities that you are worried about, you can use NumPy's isfinite function to check for them:

    @np.vectorize
    def adjust_number(number: int) -> int:
        max_number = 6
        default_substitute = 2
    
        # Convert to int only if the value is finite, otherwise use default_substitute
        if np.isfinite(number):
            number = int(number)
        else:
            number = default_substitute
    
        return min(number, max_number)
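One way to check that this version runs cleanly is to turn RuntimeWarning into an exception for the duration of the call. The following is only a sketch, assuming the input is a plain float array like the one in the question. Because the finiteness check runs before int() is called, the NaN never reaches the conversion that presumably trips the warning.

```python
import warnings

import numpy as np


@np.vectorize
def adjust_number(number):
    max_number = 6
    default_substitute = 2

    # Only call int() on finite values; NaN and +/-inf get the default
    if np.isfinite(number):
        number = int(number)
    else:
        number = default_substitute

    return min(number, max_number)


values = np.array([1.0, 9.0, np.nan])

with warnings.catch_warnings():
    # Inside this block a RuntimeWarning would be raised as an exception
    warnings.simplefilter("error", RuntimeWarning)
    result = adjust_number(values)

print(result.tolist())  # [1, 6, 2]
```

If the call completes without raising, no RuntimeWarning was emitted for this input.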
    

    If you also want to make sure that the number is actually integer-valued, even if it is held as a float, you could do:

    @np.vectorize
    def adjust_number(number: int) -> int:
        max_number = 6
        default_substitute = 2
    
        # Convert to int only if the value is finite, otherwise use default_substitute
        if np.isfinite(number):
            # make sure number is integer-valued (e.g. 3 or 3.0, but not 2.5)
            if isinstance(number, int) or (isinstance(number, float) and number.is_integer()):
                number = int(number)
            else:
                number = default_substitute
        else:
            number = default_substitute
    
        return min(number, max_number)
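To see how the stricter check behaves, here is a small sketch that runs it on a float array containing a non-integer value. The sample array is made up for illustration and is not from the question.

```python
import numpy as np


@np.vectorize
def adjust_number(number):
    max_number = 6
    default_substitute = 2

    if np.isfinite(number):
        # Accept ints and integer-valued floats such as 9.0, reject 2.5
        if isinstance(number, int) or (isinstance(number, float) and number.is_integer()):
            number = int(number)
        else:
            number = default_substitute
    else:
        number = default_substitute

    return min(number, max_number)


# Hypothetical sample input: 2.5 is finite but not integer-valued
strict_result = adjust_number(np.array([1.0, 9.0, 2.5, np.nan]))
print(strict_result.tolist())  # [1, 6, 2, 2]
```

Both 2.5 and NaN fall back to the default of 2, while 9.0 is converted and then capped at 6.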
    

    Alternatively, you don't even need np.vectorize, and could instead operate on the whole array at once:

    def adjust_number(number):
        default_substitute = 2
        max_number = 6
        # copy into a float array so the caller's data is not modified in place
        num = np.array(number, dtype=float)
        # replace NaNs/infinities with the default
        num[~np.isfinite(num)] = default_substitute
        return np.clip(num, a_min=None, a_max=max_number).astype(int)
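Applied to the DataFrame from the question, the array-based approach might look like the sketch below. It copies the input with np.array rather than np.asarray, on the assumption that np.asarray can hand back a view of the column's data, in which case the in-place assignment would either overwrite the original NaN or fail on a read-only array. Making max_number and default_substitute keyword arguments is a choice made here, not part of the original answer.

```python
import numpy as np
import pandas as pd


def adjust_number(number, max_number=6, default_substitute=2):
    # Copy into a float array so the caller's column is left untouched
    num = np.array(number, dtype=float)
    # Replace NaN and +/-inf with the default value
    num[~np.isfinite(num)] = default_substitute
    # Cap at max_number, then convert to int (this truncates any fractional part)
    return np.clip(num, a_min=None, a_max=max_number).astype(int)


df = pd.DataFrame({"numbers": [1.0, 9.0, np.nan]})
df = df.assign(adjusted_number=lambda x: adjust_number(x["numbers"]))
print(df)
```

The new column holds 1, 6 and 2, and the NaN in the original numbers column is preserved. Since no Python-level function is called per element, this should also be faster than np.vectorize on large inputs.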