Search code examples
pythonpandasdataframemapping

Fill values based on reference table


I have two df with different lenghts, one (reference/dictionary) with various types of products and brand who makes them, and another df with just the products, I have to fill brand based on the reference table

Reference df is like

Product Brand
Shirt Zara
Sneakers Nike
Pants Zara
Tennis ball Wilson
Football ball Adidas
Football boots Adidas

The df to be filled is something like

Product Brand
Shirt NaN
Shirt NaN
Pants NaN
Tennis ball NaN
Shirt NaN
Football boots NaN
Football boots NaN
Football boots NaN
Pants NaN
Sneakers NaN
Football ball NaN
Football boots NaN
(+100k rows) NaN

I´ve tried the following code

df['Brand'] = df['Brand'].map(referencedf.set_index('Product')['Brand'])
df

Unfortunately, the output is not the desired and it always returns the same brand for all products

Product Brand
Shirt Zara
Shirt Zara
Pants Zara
Tennis ball Zara
Shirt Zara
Football boots Zara
Football boots Zara
Football boots Zara
Pants Zara
Sneakers Zara
Football ball Zara
Football boots Zara
(+100k rows) Zara

Any ideas on how can i get the output right?


Solution

  • Try:

    df['Brand'] = df['Product'].map(reference_df.set_index('Product')['Brand'])
    print(df)
    

    Prints:

               Product   Brand
    0            Shirt    Zara
    1            Shirt    Zara
    2            Pants    Zara
    3      Tennis ball  Wilson
    4            Shirt    Zara
    5   Football boots  Adidas
    6   Football boots  Adidas
    7   Football boots  Adidas
    8            Pants    Zara
    9         Sneakers    Nike
    10   Football ball  Adidas
    11  Football boots  Adidas