I have two df with different lenghts, one (reference/dictionary) with various types of products and brand who makes them, and another df with just the products, I have to fill brand based on the reference table
Reference df is like
Product | Brand |
---|---|
Shirt | Zara |
Sneakers | Nike |
Pants | Zara |
Tennis ball | Wilson |
Football ball | Adidas |
Football boots | Adidas |
The df to be filled is something like
Product | Brand |
---|---|
Shirt | NaN |
Shirt | NaN |
Pants | NaN |
Tennis ball | NaN |
Shirt | NaN |
Football boots | NaN |
Football boots | NaN |
Football boots | NaN |
Pants | NaN |
Sneakers | NaN |
Football ball | NaN |
Football boots | NaN |
(+100k rows) | NaN |
I´ve tried the following code
df['Brand'] = df['Brand'].map(referencedf.set_index('Product')['Brand'])
df
Unfortunately, the output is not the desired and it always returns the same brand for all products
Product | Brand |
---|---|
Shirt | Zara |
Shirt | Zara |
Pants | Zara |
Tennis ball | Zara |
Shirt | Zara |
Football boots | Zara |
Football boots | Zara |
Football boots | Zara |
Pants | Zara |
Sneakers | Zara |
Football ball | Zara |
Football boots | Zara |
(+100k rows) | Zara |
Any ideas on how can i get the output right?
Try:
df['Brand'] = df['Product'].map(reference_df.set_index('Product')['Brand'])
print(df)
Prints:
Product Brand
0 Shirt Zara
1 Shirt Zara
2 Pants Zara
3 Tennis ball Wilson
4 Shirt Zara
5 Football boots Adidas
6 Football boots Adidas
7 Football boots Adidas
8 Pants Zara
9 Sneakers Nike
10 Football ball Adidas
11 Football boots Adidas