I have a dataframe like the one below:
import pandas as pd

sample_df = pd.DataFrame({
    'single_proj_name': [['jsfk'], ['fhjk'], ['ERRW'], ['SJBAK']],
    'single_item_list': [['ABC_123'], ['DEF123'], ['FAS324'], ['HSJD123']],
    'single_id': [[1234], [5678], [91011], [121314]],
    'multi_proj_name': [['AAA', 'VVVV', 'SASD'], ['QEWWQ', 'SFA', 'JKKK', 'fhjk'], ['ERRW', 'TTTT'], ['SJBAK', 'YYYY']],
    'multi_item_list': [[['XYZAV', 'ADS23', 'ABC_123'], ['ABC_123', 'ADC_123']], ['XYZAV', 'DEF123', 'ABC_123', 'SAJKF'], ['QWER12', 'FAS324'], ['JFAJKA', 'HSJD123']],
    'multi_id': [[[2167, 2147, 29481], [5432, 1234]], [2313, 57567, 2321, 7898], [1123, 8775], [5237, 43512]],
})
I would like to do the following:
a) Pick the value from single_item_list for each row.
b) Search for that value in the multi_item_list column of the same row. Please note that for some rows this could be a list of lists.
c) If a match is found, keep only the matched values in multi_item_list and remove all other non-matching values.
d) Based on the position of the matched item, look up the corresponding value in the multi_id list and keep only that item. Remove the items at all other positions from the list.
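To make steps (a) to (d) concrete, here is a minimal sketch for a single row whose lists are flat (no nesting). The values are copied from the second row of sample_df; the variable names (target, kept, and so on) are only illustrative and not part of the dataframe.

```python
# Values copied from the second row of sample_df (flat lists, no nesting).
single_item = ['DEF123']
multi_items = ['XYZAV', 'DEF123', 'ABC_123', 'SAJKF']
multi_ids = [2313, 57567, 2321, 7898]

# Step (a): the value to search for.
target = single_item[0]

# Steps (b) to (d): walk both lists in parallel and keep only the
# (item, id) pairs whose item equals the target.
kept = [(item, id_) for item, id_ in zip(multi_items, multi_ids) if item == target]
kept_items = [item for item, _ in kept]
kept_ids = [id_ for _, id_ in kept]

print(kept_items, kept_ids)  # ['DEF123'] [57567]
```

Pairing the two lists with zip keeps each item next to its id, so no index arithmetic is needed. This flat version does not handle the nested first row.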
So I tried the code below, but it doesn't work for nested lists of lists:
for a, b, c in zip(sample_df['single_item_list'], sample_df['multi_item_list'], sample_df['multi_id']):
    for i, x in enumerate(b):
        print(x)
        print(a[0])
        if a[0] in x:
            print(x.index(a[0]))
            pos = x.index(a[0])
            print(c[pos-1])
I expect my output to be as shown below. In my real data, I will have more cases like the first input row (lists nested multiple levels deep).
Here is one approach which works with any number of nested lists:
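def func(z, X, Y):
    # z: value to search for; X: (possibly nested) items; Y: ids with the same nesting as X
    A, B = [], []
    for x, y in zip(X, Y):
        if isinstance(x, list):
            # Recurse into the sublist and keep its filtered result as a sublist
            a, b = func(z, x, y)
            A.append(a)
            B.append(b)
        elif x == z:
            # Keep the matching item together with the id at the same position
            A.append(x)
            B.append(y)
    return A, B

c = ['single_item_list', 'multi_item_list', 'multi_id']
# Each row yields (filtered items, filtered ids), written back to the two multi_* columns
sample_df[c[1:]] = [func(z, X, Y) for [z], X, Y in sample_df[c].to_numpy()]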
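As a self-contained illustration of the same recursive idea, here is a sketch that can be run on its own. It is a variation, not the exact code above: filter_nested is just a hypothetical, more descriptive name for the helper, the frame is cut down to the first two rows of the sample data, and each column is assigned separately from a plain Python list. Per-column assignment is a design choice on my part, made so the example does not depend on how pandas converts a ragged list of tuples into a two-column block.

```python
import pandas as pd


def filter_nested(target, items, ids):
    """Keep entries of items equal to target, plus the ids at the same positions.

    Sublists are filtered recursively, so the nesting of the input is preserved.
    """
    kept_items, kept_ids = [], []
    for item, id_ in zip(items, ids):
        if isinstance(item, list):
            sub_items, sub_ids = filter_nested(target, item, id_)
            kept_items.append(sub_items)
            kept_ids.append(sub_ids)
        elif item == target:
            kept_items.append(item)
            kept_ids.append(id_)
    return kept_items, kept_ids


# First two rows of the sample data: one nested row, one flat row.
df = pd.DataFrame({
    'single_item_list': [['ABC_123'], ['DEF123']],
    'multi_item_list': [[['XYZAV', 'ADS23', 'ABC_123'], ['ABC_123', 'ADC_123']],
                        ['XYZAV', 'DEF123', 'ABC_123', 'SAJKF']],
    'multi_id': [[[2167, 2147, 29481], [5432, 1234]],
                 [2313, 57567, 2321, 7898]],
})

results = [
    filter_nested(single[0], items, ids)
    for single, items, ids in zip(df['single_item_list'],
                                  df['multi_item_list'],
                                  df['multi_id'])
]

# Assign one column at a time from plain lists of Python objects.
df['multi_item_list'] = [items for items, _ in results]
df['multi_id'] = [ids for _, ids in results]

print(df[['multi_item_list', 'multi_id']])
```

Note that a sublist with no match is kept as an empty list rather than dropped, because the recursive result is always appended.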
Result
   single_proj_name  single_item_list  single_id           multi_proj_name         multi_item_list           multi_id
0            [jsfk]         [ABC_123]     [1234]         [AAA, VVVV, SASD]  [[ABC_123], [ABC_123]]  [[29481], [5432]]
1            [fhjk]          [DEF123]     [5678]  [QEWWQ, SFA, JKKK, fhjk]                [DEF123]            [57567]
2            [ERRW]          [FAS324]    [91011]              [ERRW, TTTT]                [FAS324]             [8775]
3           [SJBAK]         [HSJD123]   [121314]             [SJBAK, YYYY]               [HSJD123]            [43512]