Search code examples
pythonpandasdataframedata-wranglingdata-munging

Creating a new column based on other columns from another dataframe


I have 2 dataframes:

df1

Name   Apples   Pears   Grapes   Peachs
James    3       5        5        2
Harry    1       0        2        9
Will     20      2        7        3

df2

Class   User   Factor  
A       Harry  3
A       Will   2
A       James  5
B       NaN    4

I want to create a new column in df2 called Total which is a list of all the columns for each user in df1, multiplied by the Factor for that user - this should only be done if they are in Class A.

This is how the final df should look

df2

Class   User   Factor   Total 
A       Harry  3        [3,0,6,27]
A       Will   2        [40,4,14,6]
A       James  5        [15,25,25,10]
B       NaN    4

This is what I tried:

df2['Total'] = list(df1.Name.isin((df2.User) and (df2.Class==A)) * df2.Factor)

Solution

  • This will do what your question asks:

    df2 = df2[df2.Class=='A'].join(df.set_index('Name'), on='User').set_index(['Class','User'])
    df2['Total'] = df2.apply(lambda x: list(x * x.Factor)[1:], axis=1)
    df2 = df2.reset_index()[['Class','User','Factor','Total']]
    

    Full test code:

    import pandas as pd
    import numpy as np
    df = pd.DataFrame(columns=[
    x.strip() for x in 'Name   Apples   Pears   Grapes   Peachs'.split()], data =[
    ['James',    3,       5,        5,        2],
    ['Harry',   1,       0,        2,        9],
    ['Will',     20,      2,        7,        3]])
    print(df)
    
    df2 = pd.DataFrame(columns=[
    x.strip() for x in 'Class   User   Factor'.split()], data =[
    ['A',       'Harry',  3],
    ['A',       'Will',   2],
    ['A',       'James',  5],
    ['B',       np.nan,    4]])
    print(df2)
    
    df2 = df2[df2.Class=='A'].join(df.set_index('Name'), on='User').set_index(['Class','User'])
    df2['Total'] = df2.apply(lambda x: list(x * x.Factor)[1:], axis=1)
    df2 = df2.reset_index()[['Class','User','Factor','Total']]
    print(df2)
    

    Input:

        Name  Apples  Pears  Grapes  Peachs
    0  James       3      5       5       2
    1  Harry       1      0       2       9
    2   Will      20      2       7       3
      Class   User  Factor
    0     A  Harry       3
    1     A   Will       2
    2     A  James       5
    3     B    NaN       4
    

    Output

      Class   User  Factor             Total
    0     A  Harry       3     [3, 0, 6, 27]
    1     A   Will       2    [40, 4, 14, 6]
    2     A  James       5  [15, 25, 25, 10]