Search code examples
pythonpython-3.xpandas

Outer product on Pandas DataFrame rows


Have two DataFrames with identical columns labels. Columns are label, data1, data2, ..., dataN. Need to take the product of the DataFrames, multiplying data1 * data1, data2 * data2, etc for every possible combination of rows in DataFrame1 with the rows in DataFrame2. As such, want the resulting DataFrame to maintain the label column of both frames in some way

Example:

Frame 1:

label d1 d2 d3
a 1 2 3
b 4 5 6

Frame 2:

label d1 d2 d3
c 7 8 9
d 10 11 12

Result:

label_1 label_2 d1 d2 d3
a c 7 16 27
a d 10 22 36
b c 28 40 54
b d 40 55 72

I feel like there is a nice way to do this, but all I can come up with is gross loops with lots of memory reallocation.


Solution

  • Let's do a cross merge first then mutiple the dn_x with dn_y

    out = df1.merge(df2, how='cross')
    out = (out.filter(like='label')
           .join(out.filter(regex='d.*_x')
                 .mul(out.filter(regex='d.*_y').values)
                 .rename(columns=lambda col: col.split('_')[0])))
    
    print(out)
    
      label_x label_y  d1  d2  d3
    0       a       c   7  16  27
    1       a       d  10  22  36
    2       b       c  28  40  54
    3       b       d  40  55  72