Tags: r, pytorch, tensor, torch, pairwise-distance

Torch custom pairwise distance


I'm working with a large matrix (10-80k rows; 3k columns) and I want to compute a custom pairwise distance between its rows, and do it fast. I have tried Armadillo, but with data this large it is still slow. I then tried torch with CUDA acceleration, and its built-in Euclidean distance is really fast (about 100 times faster). So now I want to implement a custom pairwise distance: for each pair of rows (a and b), take the standard deviation of ai*bi (where i indexes the columns). For example:

my_mat:
   |1    |2    |3    |4
a  |5    |3    |0    |4
b  |1    |6    |2    |3

a//b dist = std(5*1,3*6,0*2,4*3)
          = std(5,18,0,12)
          = 7.889867
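
The hand calculation above can be checked with a few lines of plain Python. This is only a sketch for the two example rows; it assumes the intended statistic is the sample standard deviation (n - 1 denominator), which is what reproduces 7.889867.

```python
from statistics import stdev

# The two example rows from my_mat.
a = [5, 3, 0, 4]
b = [1, 6, 2, 3]

# Column-wise products: [5, 18, 0, 12]
products = [x * y for x, y in zip(a, b)]

# Sample standard deviation (n - 1 denominator) of the products.
dist_ab = stdev(products)
print(round(dist_ab, 6))  # 7.889867
```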

My idea: start from my two-dimensional (N,M) tensor (my_mat) and create a new three-dimensional tensor (N,N,P), where the P dimension stores a "list" of the column-wise products for each pair of rows (so P = M):

3_dim_tens :

   |a                      |b
a  |Pdim(5*5,3*3,0*0,4*4)  |Pdim(5*1,3*6,0*2,4*3)
b  |Pdim(5*1,3*6,0*2,4*3)  |Pdim(1*1,6*6,2*2,3*3)

Then, if I reduce the P dimension with std(), I get a two-dimensional (N,N) pairwise matrix containing my custom distance. (It is essentially the matrix product my_mat * t(my_mat), but with std in place of the sum.)

Is it possible to do this with torch, or is there another way to compute a custom pairwise distance?


Solution

  • I think the most intuitive way is to use einsum for this:

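    import torch

    # Rows of `a` are the vectors to compare (the example from the question).
    a = torch.tensor([[5.0, 3, 0, 4], [1, 6, 2, 3]])
    # 'ij,kj->ikj' builds the (N, N, M) tensor of products a[i, j] * a[k, j];
    # std(dim=2) then reduces the last axis with the sample standard deviation.
    b = torch.einsum('ij,kj->ikj', a, a).std(dim=2)
    print(b)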
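
One caveat for the sizes mentioned in the question: the einsum materializes the full (N, N, M) tensor, which for 10,000 rows and 3,000 columns in float32 is about 1.2 TB. A possible workaround, sketched below, uses the identity var = (sum(x^2) - sum(x)^2 / M) / (M - 1), so the same result comes from two matrix products and only (N, N) tensors are ever stored. The function name `pairwise_product_std` is made up for this illustration, and the formula can lose precision through cancellation, so float64 (or a chunked einsum) may be safer for real data.

```python
import torch

def pairwise_product_std(x: torch.Tensor) -> torch.Tensor:
    """Sample std of the column-wise products, for every pair of rows of x."""
    m = x.shape[1]
    s1 = x @ x.T              # s1[i, k] = sum_j x[i, j] * x[k, j]
    s2 = (x * x) @ (x * x).T  # s2[i, k] = sum_j (x[i, j] * x[k, j]) ** 2
    var = (s2 - s1 * s1 / m) / (m - 1)
    # Rounding can push tiny variances slightly below zero, so clamp first.
    return var.clamp_min(0).sqrt()

a = torch.tensor([[5.0, 3, 0, 4], [1, 6, 2, 3]], dtype=torch.float64)
dist = pairwise_product_std(a)

# Compare against the einsum version on the small example.
reference = torch.einsum('ij,kj->ikj', a, a).std(dim=2)
print(torch.allclose(dist, reference))
```

On a GPU the same function should work unchanged on a tensor created with `device='cuda'`, since it only relies on matrix multiplication and element-wise operations.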