Search code examples
matlabperformancemeaneuclidean-distanceintegral

High performance computation of mean 2D-Euclidian-distance


Let's say I have a position matrix P with dimensions 10x2, where the first column contains x values and second column the corresponding y values. I want the mean of the lengths of the positions. The way I have done this so far is with the following code:

avg = sum( sqrt( P(:,1).^2 + P(:,2).^2))/10);

I've been told that the integral function integral2 is much faster and more precise for this task. How can I use integral2 to compute the mean value?


Solution

  • Just so this question doesn't remain unanswered:

    function q42372466(DO_SUM)
    if ~nargin % nargin == 0
      DO_SUM = true;
    end
    
    % Generate some data:
    P = rand(2E7,2);
    
    % Correctness:
    R{1} = m1(P);
    R{2} = m2(P);
    R{3} = m3(P);
    R{4} = m4(P);
    R{5} = m5(P);
    R{6} = m6(P);
    
    for ind1 = 2:numel(R)
      assert(abs(R{1}-R{ind1}) < 1E-10);
    end
    
    % Benchmark:
    t(1) = timeit(@()m1(P));
    t(2) = timeit(@()m2(P));
    t(3) = timeit(@()m3(P));
    t(4) = timeit(@()m4(P));
    t(5) = timeit(@()m5(P));
    t(6) = timeit(@()m6(P));
    
    % Print timings:
    disp(t);
    
    
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    
    % Original method:
    function out = m1(P)
      if DO_SUM
        out = sum( sqrt( P(:,1).^2 + P(:,2).^2))/max(size(P));
      else
        out = mean( sqrt( P(:,1).^2 + P(:,2).^2));
      end
    end
    
    % pdist2 method:
    function out = m2(P)
      if DO_SUM
        out = sum(pdist2([0,0],P))/max(size(P));
      else
        out = mean(pdist2([0,0],P));
      end
    end
    
    % Shortened method #1:
    function out = m3(P)
      if DO_SUM  
        out = sum(sqrt(sum(P.*P,2)))/max(size(P));
      else
        out = mean(sqrt(sum(P.*P,2)));
      end    
    end
    
    % Shortened method #2:
    function out = m4(P)
      if DO_SUM  
        out = sum(sqrt(sum(P.^2,2)))/max(size(P));
      else
        out = mean(sqrt(sum(P.^2,2)));
      end    
    end
    
    % hypot
    function out = m5(P)
      if DO_SUM
        out = sum(hypot(P(:,1),P(:,2)))/max(size(P));
      else
        out = mean(hypot(P(:,1),P(:,2)));
      end
    end
    
    % (a+b)^2 formula , Divakar's idea
    function out = m6(P)
      % Since a^2 + b^2 = (a+b)^2 - 2ab, 
      if DO_SUM
        out = sum(sqrt(sum(P,2).^2 - 2*prod(P,2)))/max(size(P));
      else
        out = mean(sqrt(sum(P,2).^2 - 2*prod(P,2)));
      end
    end
    
    end
    

    Typical result on my R2016b + Win10 x64:

    >> q42372466(0) % with mean()
        0.1165    0.1971    0.2167    0.2161    0.1719    0.2375
    
    >> q42372466(1) % with sum()
        0.1156    0.1979    0.2233    0.2181    0.1610    0.2357
    

    Which means that your method is actually the best of the above, by a considerable margin!
    (Honestly - didn't expect that!)