python pandas numpy

Use np.nan without importing numpy


I usually replace empty strings with NaN and then call dropna() to remove rows with missing data.

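import pandas as pd
import numpy as np

# df is an existing DataFrame: empty strings become NaN,
# then every row that contains a NaN is dropped
df.replace('', np.nan).dropna()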

However, I want to run my function with the Serverless Framework. Importing numpy just to use np.nan eats into my precious 250 MB package size limit.

Using pd.np.nan works, but it raises a warning that the pandas.np module is deprecated and will be removed in a future version of pandas.

Is there any solution to use np.nan without importing numpy?


Solution

  • Use pd.NA instead.

    From the Docs:

    Starting from pandas 1.0, an experimental pd.NA value (singleton) is available to represent scalar missing values. At this moment, it is used in the nullable integer, boolean and dedicated string data types as the missing value indicator. The goal of pd.NA is to provide a “missing” indicator that can be used consistently across data types (instead of np.nan, None or pd.NaT depending on the data type).
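
As a minimal sketch of how this could look for the case in the question: the DataFrame and its column names below are made up for illustration, since the question does not show its data. The second variant uses float("nan"), a plain-Python NaN that should behave like np.nan here, as an alternative if pd.NA is not wanted.

```python
import pandas as pd

# Hypothetical sample data; the original question does not show its DataFrame.
df = pd.DataFrame({
    "name": ["alice", "", "carol"],
    "city": ["Oslo", "Rome", ""],
})

# Variant 1: use pd.NA as the replacement value (pandas >= 1.0, no numpy import).
cleaned_na = df.replace("", pd.NA).dropna()

# Variant 2: use a plain-Python float NaN instead of np.nan.
cleaned_nan = df.replace("", float("nan")).dropna()

print(cleaned_na)
print(cleaned_nan)
```

In both variants only the row without empty strings should remain. Note that pandas itself depends on numpy, so numpy is normally installed alongside pandas anyway; dropping the explicit import may not reduce the deployed package size.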