I have the following xlsx file that I need to work on:
I want to iterate through the dataframe and if the column ITEM CODE contains a dictionary key, I want to check on the same row if contains a dictionary value[0] (first position in the tuple) and if contains I want to insert dictionary value1 (second position in the tuple) into another column named SKU
Dataframe: #df3 = df2.append(df1)
catp = {"2755":(('24','002'),('25','003'),('26','003'),('27','004'),('28','005'),('29','006'),('30','007'),('31','008'),
('32','009'),('32','010'),('33','011'),('34','012'),('35','013'),('36','014')),
"2513":(('38','002'),('40','003'),('42','004'),('44','005'),('46','006'),('48','007'),('50','008'),('52','009'),
('54','010'))}
for i, row in df3.iterrows():
if catp.key() in df3['ITEM CODE'][i] and catp.value()[0] in df3['TG'][i]:
codmarime = catp.value()[1]
df3['SKU'][i] = '20'+df3['ITEM CODE'][i]+[i]+codmarime
else:
df3['SKU'][i] = '20'+df3['ITEM CODE'][i]+'???'
If 2755 and 24 found SKU = '202755638002'
If 2513 and 44 found SKU = '202513123005'
As you failed to provide text data to create at least a fragment of your DataFrame, I copied from your picture 3 rows, creating my test DataFrame:
df3 = pd.DataFrame(data=[
[ '1513452', 'AVRO D2', '685', 'BLACK/BLACK/ANTRACITE', '24', 929.95, '8052644627565' ],
[ '2513452', 'AVRO D2', '685', 'BLACK/BLACK/ANTRACITE', '21', 929.95, '8052644627565' ],
[ '2755126', 'AMELIA', 'Y17', 'DARK-DENIM', '24', 179.95, '8052644627565' ]],
columns=[ 'ITEM CODE', 'ITEM', 'COLOR', 'COLOR CODE', 'TG', 'PRICE', 'EAN' ])
Details:
catp
keys in ITEM CODE
column.ITEM CODE
contains one of your codes (2513) but for TG
column no tuple saved under 2513
key contains first element == 21.ITEM CODE
contains one of your codes (2755), TG
== 24
and among tuples saved under 2755
there is one == 24.Then we have to define a couple of auxiliary functions:
def findContainedCodeAndVal(dct, str):
for eachKey in dct.keys():
if str.find(eachKey) >= 0:
return (eachKey, dct[eachKey])
else:
return (None, None)
This function attempts to find in dct
a key contained in str
.
It returns a 2-tuple containing the key found and associated value from dct
.
def find2ndElem(tuples, str):
for tpl in tuples:
if tpl[0] == str:
return tpl[1]
else:
return ''
This function checks each tuple from tuples
whether its first element
== str
and returns the second element from this tuple.
And the last function to define is a function to be applied to each row
from your DataFrame. It returns the value to be saved in SKU
column:
def fn(row):
ind = row.name # Read row index
iCode = row['ITEM CODE']
k, val = findContainedCodeAndVal(catp, iCode)
codmarime = ''
if k:
tg = row.TG
codmarime = find2ndElem(val, tg)
if codmarime == '':
codmarime = '???'
return f'20/{iCode}/{ind}/{codmarime}'
Note that it uses your catp
dictionary.
For demonstration purposes, I introduced in the returned value additional slashes, separating adjacent parts. In the target version remove them.
And the last thing to do is to compute SKU
column of your DataFrame,
applying fn
function to each row of df3
and saving the result under
SKU
column:
df3['SKU'] = df3.apply(fn, axis=1)
When you print the DataFrame (containig my test data), SKU
column will
contain:
20/1513452/0/???
20/2513452/1/???
20/2755126/2/002