I have two df and want to merge them because I need one df for dashboard. My problem is that my data has no unique key and all the data points are repeating. For example I have df1 like this:
Web Obj
A ObJA
A ObjB
B ObjA
B ObjD
B ObjA
df2
Web Lab Cat
A LabA F
A LabB F
A LabF F
A LabA F
B LabG G
B LabD G
I want to merge them both but I have no key... I was thinking that I can create Web's each item 6 times as values will not exceed more than 6... and then add values one by one... the extra Web value show "NaN" So the output would look like this
Web Lab Cat Obj
A LabA F ObJA
A LabB F ObjB
A LabF F
A LabA F
A
A
B LabG G ObjA
B LabD G ObjD
B ObjA
B
B
B
Or any other way... ?
You could create a key
column and do the merge
that way:
df2 = (df2.assign(key=df2['Web'] + (df2.groupby('Web').cumcount() + 1).astype(str))
.merge(df1.assign(key=df1['Web'] + (df1.groupby('Web').cumcount() + 1).astype(str))
.drop('Web', axis=1),
on='key', how='outer'))
df2['Web']=df2['Web'].ffill()
df2
Out[1]:
Web Lab Cat key Obj
0 A LabA F A1 ObJA
1 A LabB F A2 ObjB
2 A LabF F A3 NaN
3 A LabA F A4 NaN
4 B LabG G B1 ObjA
5 B LabD G B2 ObjD
6 B NaN NaN B3 ObjA