Have two DataFrames with identical columns labels. Columns are label, data1, data2, ..., dataN
.
Need to take the product of the DataFrames, multiplying data1 * data1, data2 * data2, etc for every possible combination of rows in DataFrame1 with the rows in DataFrame2. As such, want the resulting DataFrame to maintain the label column of both frames in some way
Example:
Frame 1:
label | d1 | d2 | d3 |
---|---|---|---|
a | 1 | 2 | 3 |
b | 4 | 5 | 6 |
Frame 2:
label | d1 | d2 | d3 |
---|---|---|---|
c | 7 | 8 | 9 |
d | 10 | 11 | 12 |
Result:
label_1 | label_2 | d1 | d2 | d3 |
---|---|---|---|---|
a | c | 7 | 16 | 27 |
a | d | 10 | 22 | 36 |
b | c | 28 | 40 | 54 |
b | d | 40 | 55 | 72 |
I feel like there is a nice way to do this, but all I can come up with is gross loops with lots of memory reallocation.
Let's do a cross merge first then mutiple the dn_x
with dn_y
out = df1.merge(df2, how='cross')
out = (out.filter(like='label')
.join(out.filter(regex='d.*_x')
.mul(out.filter(regex='d.*_y').values)
.rename(columns=lambda col: col.split('_')[0])))
print(out)
label_x label_y d1 d2 d3
0 a c 7 16 27
1 a d 10 22 36
2 b c 28 40 54
3 b d 40 55 72