I have two DataFrames, each with around 30k rows and 8 columns. I need to subtract every row of the first DataFrame from every row of the second (to compute the Euclidean distance between every pair of rows), which will probably produce a 3D structure holding the differences for every pair of rows. I've tried several approaches, but each one takes a very long time to complete. Is there an efficient way to do this?
For what it's worth, the Cartesian product of the two DataFrames can be built as follows:
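import pandas as pd
df1 = pd.DataFrame({'A': [1,2,3]})
df2 = pd.DataFrame({'B': [4,5,6]})
# Give both frames the same constant key, merge on it so that every row of
# df1 is paired with every row of df2, then drop the helper column.
df3 = pd.merge(df1.assign(key=1), df2.assign(key=1), on='key').drop('key', axis=1)
df3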
#    A  B
# 0  1  4
# 1  1  5
# 2  1  6
# 3  2  4
# 4  2  5
# 5  2  6
# 6  3  4
# 7  3  5
# 8  3  6
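As for the distances themselves, you may not need the Cartesian product or the 3D array of differences at all. With 30k rows on each side and 8 columns, that array would hold 30,000 × 30,000 × 8 = 7.2 billion values (roughly 58 GB as float64), and even the 30,000 × 30,000 distance matrix on its own is about 7.2 GB. Below is a minimal sketch, assuming all columns are numeric and that the distance matrix is what you are ultimately after. It relies on the identity ||a - b||² = ||a||² + ||b||² - 2·a·b and processes the first frame in chunks so only one block of distances is in memory at a time. The names `pairwise_euclidean` and `chunk_size` are mine, not part of pandas or NumPy, and the demo data is random.

```python
import numpy as np
import pandas as pd


def pairwise_euclidean(df_a, df_b, chunk_size=1000):
    """Yield (start_row, block) pairs, where block[i, j] is the Euclidean
    distance between row start_row + i of df_a and row j of df_b."""
    a = df_a.to_numpy(dtype=float)
    b = df_b.to_numpy(dtype=float)
    b_sq = (b ** 2).sum(axis=1)  # squared norm of every row of df_b

    for start in range(0, len(a), chunk_size):
        chunk = a[start:start + chunk_size]
        chunk_sq = (chunk ** 2).sum(axis=1)[:, None]
        # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, for the whole block at once
        sq = chunk_sq + b_sq[None, :] - 2.0 * chunk @ b.T
        # Rounding error can push tiny values slightly below zero
        np.maximum(sq, 0.0, out=sq)
        yield start, np.sqrt(sq)


# Small demo with made-up data: 5 and 4 rows, 8 columns each
rng = np.random.default_rng(0)
df_a = pd.DataFrame(rng.random((5, 8)))
df_b = pd.DataFrame(rng.random((4, 8)))

# Stacking the blocks is fine for a demo; at 30k x 30k you would more likely
# reduce each block (e.g. keep the nearest neighbour) instead of storing it all.
dist = np.vstack([block for _, block in pairwise_euclidean(df_a, df_b, chunk_size=2)])
print(dist.shape)
```

If SciPy is available, `scipy.spatial.distance.cdist(df_a.to_numpy(), df_b.to_numpy())` returns the same full matrix in a single call, but it allocates the whole result at once, so the memory figures above still apply.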