python pandas categories region

Add a category column to a pandas dataframe from an existing list of categories


I have a pandas dataframe like this:

Country_Name    Date        Population
Afghanistan     7/1/2000    25950816
Afghanistan     7/1/2010    34385068
Albania         7/1/2000    3071856
Albania         7/1/2010    3204284
Algeria         7/1/2000    30533827
Algeria         7/1/2010    35468208
...

I also have another dataframe with region data:

Region  Country
Asia    Afghanistan
Europe  Albania
Africa  Algeria
Europe  Andorra
Africa  Angola
...

I am trying to add a column to my first dataframe that adds the proper region category to each country row. I don't have code because I'm not sure where to begin.

Thanks


Solution

  • Assuming df1 is your first dataframe and df2 is your second, you can do a left join on the country name. The country column is named differently in the two dataframes, so rename it in df2 first:

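    # Rename df2's country column so both dataframes share the join key
    df2 = df2.rename(columns={'Country': 'Country_Name'})
    # Left join: keep every row of df1 and attach the matching Region
    merged = df1.merge(df2, on='Country_Name', how='left')

    You can keep working with merged, or skip the intermediate name and assign the result straight back to df1 (do one or the other, not both, or Region gets merged in twice):

    df1 = df1.merge(df2, on='Country_Name', how='left')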
    

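    Either way, the result has a Region column next to each country row. Because it is a left join, any country that is missing from df2 gets NaN as its region.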
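
For reference, here is a minimal end-to-end sketch of the merge using only the sample rows from the question. It is one way to do it, not the only one: instead of renaming, it passes left_on/right_on and drops the duplicate key column afterwards. The validate argument, the conversion to a category dtype, and the unmatched check are my additions, not something the question asked for.

```python
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "Country_Name": ["Afghanistan", "Afghanistan", "Albania",
                     "Albania", "Algeria", "Algeria"],
    "Date": ["7/1/2000", "7/1/2010"] * 3,
    "Population": [25950816, 34385068, 3071856,
                   3204284, 30533827, 35468208],
})
df2 = pd.DataFrame({
    "Region": ["Asia", "Europe", "Africa", "Europe", "Africa"],
    "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
})

# Left join on differently named key columns, then drop the redundant key.
# validate="many_to_one" raises if a country appears more than once in df2,
# which would otherwise silently duplicate rows of df1.
merged = (
    df1.merge(df2, left_on="Country_Name", right_on="Country",
              how="left", validate="many_to_one")
       .drop(columns="Country")
)

# Optional: store the region as a pandas categorical
merged["Region"] = merged["Region"].astype("category")

# Countries in df1 with no match in df2 end up with NaN in Region
unmatched = merged.loc[merged["Region"].isna(), "Country_Name"].unique()

print(merged)
print("Countries without a region:", list(unmatched))
```

If the unmatched list is not empty on your real data, the usual cause is a spelling difference between the two country columns.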