python pandas function merge nonetype

Pandas fillna with method=None (default value) raises an error


I am writing a function to aid DataFrame merges between two tables. The function creates a mapping key in the first DataFrame using variables in the second DataFrame.

My issue arises when I pass a method argument to .fillna() at the end of the function.

# Import libraries
import pandas as pd

# Create data (None marks the missing values)
data_1 = {"col_1": [1, 2, 3, 4, 5], "col_2": [1, None, 3, None, 5]}
data_2 = {"col_1": [1, 2, 3, 4, 5], "col_3": [1, None, 3, None, 5]}

df = pd.DataFrame(data_1)
df2 = pd.DataFrame(data_2)

def merge_on_key(df, df2, join_how="left", fill_na=None):
    # Import libraries
    import pandas as pd

    # Code to create mapping key not required for question

    # Merge the two dataframes
    print(fill_na)
    print(type(fill_na))
    df3 = pd.merge(df, df2, how=join_how, on="col_1").fillna(method=fill_na)

    return df3

df3 = merge_on_key(df, df2)

output:
>>> None
>>> <class 'NoneType'>

error message:
ValueError: Must specify a fill 'value' or 'method'

My question is: why does passing fill_na, which equals None, raise an error when None is already the default value of the method parameter in fillna?


Solution

  • You have to pass either a 'value' or a 'method'. In your call to fillna both are None: value defaults to None, and you pass method=None explicitly. None is not a fill method in itself; it only means "no method given". pandas is therefore left with nothing to fill the missing values with, and it raises the ValueError.

    Based on the fillna docs, you could either assign a non-empty value:

    df3 = pd.merge(df, df2, how=join_how, on="col_1").fillna(value=0, method=fill_na)
    

    or change the method from None (which means "directly substitute the missing values in the dataframe with the given value") to one of {'backfill', 'bfill', 'pad', 'ffill'} (each described in the docs):

    df3 = pd.merge(df, df2, how=join_how, on="col_1").fillna(method='backfill')
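
If you want the function to keep working when no fill is requested, one option is to call a fill only when the caller actually asks for one. Below is a minimal sketch of that idea, not the only way to do it. The parameter names fill_value and fill_method are hypothetical (they are not in the question), and the mapping-key step is left out just as in the question. It uses DataFrame.ffill() and DataFrame.bfill() rather than fillna(method=...), since recent pandas releases deprecate the method argument of fillna.

```python
import pandas as pd


def merge_on_key(df, df2, join_how="left", fill_value=None, fill_method=None):
    """Merge on col_1, then fill gaps only if the caller asked for it.

    fill_value and fill_method are hypothetical parameter names chosen
    for this sketch; they do not come from the original question.
    """
    merged = pd.merge(df, df2, how=join_how, on="col_1")

    # A concrete value takes precedence: replace every missing entry with it.
    if fill_value is not None:
        return merged.fillna(fill_value)

    # Forward fill: propagate the last valid observation downwards.
    if fill_method in ("ffill", "pad"):
        return merged.ffill()

    # Backward fill: use the next valid observation to fill the gap.
    if fill_method in ("bfill", "backfill"):
        return merged.bfill()

    if fill_method is not None:
        raise ValueError(f"unknown fill_method: {fill_method!r}")

    # Nothing requested: return the merged frame with its gaps intact.
    return merged


df = pd.DataFrame({"col_1": [1, 2, 3, 4, 5], "col_2": [1, None, 3, None, 5]})
df2 = pd.DataFrame({"col_1": [1, 2, 3, 4, 5], "col_3": [1, None, 3, None, 5]})

unfilled = merge_on_key(df, df2)                          # no error, gaps kept
zero_filled = merge_on_key(df, df2, fill_value=0)         # gaps become 0
back_filled = merge_on_key(df, df2, fill_method="bfill")  # gaps take the next value
```

With this shape, calling merge_on_key(df, df2) with the defaults simply returns the merged frame, so the ValueError from the question cannot occur.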