I have broken a more complex problem into something simpler. The actual problem has larger lists and several more columns.
Starting with this df:
i | COL1 | COL2 | COL3 | COL4 | Revenue | QTY | Products
0 | Coin | Gold Krug | Gold Coin | Coins | 2333677473 | 21 | 12
1 | Gold Coin | Coins | Gold Coin | Coins | 2564774784 | 28 | 14
2 | Gold Coin | Coins | Gold Krug | Coins |3256666647 | 35 | 16
3 |Gold Coin | Coins | Coins |Gold Krug | 3456788 | 42 | 18
4 |Gold Krug | Gold Coin | Coins | Coins | 4588960 | 49 | 20
5 |Gold Coin | Coins | Gold Krug | Coins |346869909 |56 | 22
6 |Gold Coin | Coins | Gold Coin | Coins | 3777989 |63 | 24
7 |Gold Coin |Silver Krug |Gold Coin | Coins | 37687589 |70 | 26
8 |Gold Coin | Coins |Gold Coin | Coins | 45789889 |77 | 28
9 |Gold Coin | Gold Krug |Gold Coin |Coins | 468 |84 | 30
I would like output to be DF with a new column like this:
i | Category | Revenue | QTY |Products
0 |Gold Krug | 2333677473 |21 | 12
2 |Gold Krug | 3256666647 | 35 | 16
3 |Gold Krug | 3456788 | 42 | 18
4 | Gold Krug | 4588960 | 49 | 20
5 | Gold Krug | 346869909 | 56 | 22
7 | Silver Krug | 37687589 | 70 | 26
9 | Gold Krug | 468 | 84 | 30
I used this, but simply do not understand how to create a new column using the value in the list that matches for the new column :
KRUG = ['Gold Krug', 'Silver Krug', 'Gold Maple','Gold Eagle']
df = df[df[['COL1', 'COL2', 'COL3', 'COL4 ']].isin(KRUG).any(axis=1)]
print(df)
output :
i |COL1 |COL2 |COL3 |COL4 |Revenue |QTY |Products
0 |Coin |Gold Krug |Gold Coin |Coins |2333677473 |21 |12
2 |Gold Coin |Coins |Gold Krug |Coins |3256666647 |35 |16
3 |Gold Coin |Coins |Coins |Gold Krug |3456788 |42 |18
4 |Gold Krug |Gold Coin |Coins |Coins |4588960 |49 |20
5 |Gold Coin |Coins |Gold Krug |Coins |346869909 |56 |22
7 |Gold Coin |Silver Krug |Gold Coin |Coins |37687589 |70 |26
9 |Gold Coin |Gold Krug |Gold Coin |Coins |468 |84 |30
Split the search into two sections, and concatenate:
category = (df.filter(like='COL')
.agg(','.join, axis = 1)
.str.extract(fr"({'|'.join(KRUG)})")
.dropna()
.set_axis(['category'], axis = 'columns')
)
others = df.loc[df.filter(like='COL').isin(KRUG).any(1),
['Revenue', 'QTY', 'Products']]
pd.concat([category, others], axis = 'columns')
category Revenue QTY Products
0 Gold Krug 2333677473 21 12
2 Gold Krug 3256666647 35 16
3 Gold Krug 3456788 42 18
4 Gold Krug 4588960 49 20
5 Gold Krug 346869909 56 22
7 Silver Krug 37687589 70 26
9 Gold Krug 468 84 30