Search code examples
pythonpandasmergekey

Merging two df without any key


I have two df and want to merge them because I need one df for dashboard. My problem is that my data has no unique key and all the data points are repeating. For example I have df1 like this:

Web       Obj
A         ObJA
A         ObjB
B         ObjA
B         ObjD
B         ObjA

df2

Web       Lab        Cat
A         LabA       F
A         LabB       F
A         LabF       F
A         LabA       F
B         LabG       G
B         LabD       G

I want to merge them both but I have no key... I was thinking that I can create Web's each item 6 times as values will not exceed more than 6... and then add values one by one... the extra Web value show "NaN" So the output would look like this

  Web        Lab        Cat            Obj
    A         LabA       F             ObJA
    A         LabB       F             ObjB
    A         LabF       F
    A         LabA       F
    A
    A
    B         LabG       G             ObjA
    B         LabD       G             ObjD
    B                                  ObjA
    B
    B
    B

Or any other way... ?


Solution

  • You could create a key column and do the merge that way:

    df2 = (df2.assign(key=df2['Web'] + (df2.groupby('Web').cumcount() + 1).astype(str))
              .merge(df1.assign(key=df1['Web'] + (df1.groupby('Web').cumcount() + 1).astype(str))
              .drop('Web', axis=1),
               on='key', how='outer'))
    df2['Web']=df2['Web'].ffill()
    df2
    Out[1]: 
      Web   Lab  Cat key   Obj
    0   A  LabA    F  A1  ObJA
    1   A  LabB    F  A2  ObjB
    2   A  LabF    F  A3   NaN
    3   A  LabA    F  A4   NaN
    4   B  LabG    G  B1  ObjA
    5   B  LabD    G  B2  ObjD
    6   B   NaN  NaN  B3  ObjA