Tags: python, pandas, dataframe, data-analysis

Fill missing values of a column by mapping the column values with a different column


The dataframe looks like this:

Year Month
2021 10
2021 11
2021 12
2022 1
2022 2
2022 3
NaN 1

I need to fill the null value in the Year column based on the value in the corresponding Month column: if the month is 1, 2, or 3, the year should be 2022; if the month is 10, 11, or 12, the year should be 2021.

For this example, I need to fill the NaN value of Year with 2022. What is the most efficient way to do this with pandas?


Solution

  • You can use boolean indexing and numpy.select:

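    import numpy as np

    # rows where Year is missing
    m = df['Year'].isna()

    # pick the year from the month range; months outside both ranges stay NaN
    # (without default=np.nan, np.select would fill them with 0)
    df.loc[m, 'Year'] = np.select([df.loc[m, 'Month'].between(1, 3),
                                   df.loc[m, 'Month'].between(10, 12)],
                                  [2022, 2021], default=np.nan)
    # optional: convert Year to a nullable integer dtype (Int64)
    df = df.convert_dtypes()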
    

    Output:

       Year  Month
    0  2021     10
    1  2021     11
    2  2021     12
    3  2022      1
    4  2022      2
    5  2022      3
    6  2022      1
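
  • As an alternative sketch (not part of the original answer), the month-to-year rule could be derived from the rows where Year is already known and applied with Series.map, instead of hard-coding the ranges. This assumes each month appears with only one year in the known rows, as in the sample data; the names known and month_to_year are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "Year": [2021, 2021, 2021, 2022, 2022, 2022, np.nan],
    "Month": [10, 11, 12, 1, 2, 3, 1],
})

# Build a Month -> Year lookup from the rows where Year is known
# (assumption: every month maps to exactly one year in those rows)
known = df.dropna(subset=["Year"])
month_to_year = known.drop_duplicates("Month").set_index("Month")["Year"]

# Fill only the missing years; months absent from the lookup stay NaN
df["Year"] = df["Year"].fillna(df["Month"].map(month_to_year))

# Optional: convert Year to a nullable integer dtype (Int64)
df = df.convert_dtypes()
print(df)
```

    For the sample data this should fill the last row with 2022. If a month can occur with more than one year, the lookup is ambiguous and the explicit ranges with numpy.select are the safer choice.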