Search code examples
pythonpandasdataframemany-to-one

Joining Dataframes Many to One


I have two dataframes, one with info about users and the other about items transactions that I'd like to join together.

The users df have a column with last Trans Ref, items have a Trans Ref too, but as each user buys many things there is a many-to-one relationship.

Say I had user A, B, C with trans refs 123, 456, 789

Then I had transactions with references:

123-001, 123-002, 123-003, 124-004
456-001,
789-001, 789-002, 789-003

I can cut the item number off the ends of my trans refs and match them to users (many-to-one)

How can this be done in a Pandas Dataframe?


Solution

  • Setup:

    Users dataframe

    users_df = pd.DataFrame({'UserID':['A','B','C'],'Trans Ref':[123,456,789]})
    
       Trans Ref UserID
    0        123      A
    1        456      B
    2        789      C
    

    Transaction dataframe

    trans_df = pd.DataFrame({'Tran Refs':[['123-001','123-002','123-002','123-004'],
                                          ['456-001'],['789-001','789-002','789-003']],
                             'Trans Description':['Transaction Info 123',
                                                  'Transaction Info 456',
                                                  'Transaction Info 789']})
    
                                  Tran Refs     Trans Description
    0  [123-001, 123-002, 123-002, 123-004]  Transaction Info 123
    1                             [456-001]  Transaction Info 456
    2           [789-001, 789-002, 789-003]  Transaction Info 789
    

    Reshape trans_df and merge with users_df many to one.

    df_out = (trans_df.set_index('Trans Description')['Tran Refs']
            .apply(lambda x:pd.Series(x))
            .stack()
            .str.split('-').str[0]  #trim -00x from trans ref
            .astype(int)
            .reset_index(name='Trans Ref')
            .drop('level_1',axis=1)
            .merge(users_df, on='Trans Ref'))  #join to users_df on Trans Ref
    

    Output:

          Trans Description  Trans Ref UserID
    0  Transaction Info 123        123      A
    1  Transaction Info 123        123      A
    2  Transaction Info 123        123      A
    3  Transaction Info 123        123      A
    4  Transaction Info 456        456      B
    5  Transaction Info 789        789      C
    6  Transaction Info 789        789      C
    7  Transaction Info 789        789      C