Tags: matlab, grouping, matrix-indexing

Compute mean of matrix rows by groups of indices in MATLAB


I have the following data:

A = [1 2 ; 3 2; 4 7; 10 2; 6 7; 10 9]
B = [1 2 3; 4 4 9; 1 8 0; 3 7 9; 3 6 8]
C = [4; 10; 6; 3; 1]

A =
    1     2
    3     2
    4     7
   10     2
    6     7
   10     9

B =
    1     2     3
    4     4     9
    1     8     0
    3     7     9
    3     6     8

C.' =
    4    10     6     3     1

For each unique value in A(:,2) I need to take the corresponding values in A(:,1), look up their positions in C, then take the rows of B at those positions and compute their mean. The result should be of size length(unique(A(:,2))) x size(B,2).

The expected result for this example:

  • Value "2": mean of rows 2, 4 and 5 from B. Explanation: the values 1, 3 and 10 in A(:,1) that correspond to value "2" in A(:,2) are found at indices 5, 4 and 2 in C, respectively.

Correspondingly:

  • Value "7": mean of rows 1 and 3 from B.
  • Value "9": mean of row 2 from B.

I currently compute it by applying unique to A(:,2) and iterating over each value, searching for the right indices. My data set is quite large, so this takes a long time. How can I avoid the loops?
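For reference, the loop-based approach described above might look roughly like the sketch below. It is a guess at the baseline, not the asker's actual code; the names vals and rows are introduced here for illustration. Note that it uses ismember(C, vals), so each row of B is counted at most once per group even if a value repeats within a group of A.

```matlab
% Sample data from the question
A = [1 2; 3 2; 4 7; 10 2; 6 7; 10 9];
B = [1 2 3; 4 4 9; 1 8 0; 3 7 9; 3 6 8];
C = [4; 10; 6; 3; 1];

U = unique(A(:, 2));                     % distinct group labels, sorted ascending
result = zeros(numel(U), size(B, 2));    % one row per group, one column per column of B
for k = 1:numel(U)
    vals = A(A(:, 2) == U(k), 1);        % values of A(:,1) belonging to this group
    rows = find(ismember(C, vals));      % positions in C holding those values
    result(k, :) = mean(B(rows, :), 1);  % column-wise mean, also safe for a single row
end
```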


Solution

  • Let's do what you say in the question step by step:

    1. For each unique value in A(:, 2):

      [U, ia, iu] = unique(A(:, 2));
      
    2. Take the corresponding values in A(:, 1) and look for their value in C:

      [tf, loc] = ismember(A(:, 1), C);
      

      It's also recommended to make sure, just in case, that all values are actually found in C:

      assert(all(tf))
      
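    3. Then take the relevant rows in B and compute their mean. Each element of B(loc, :) is paired with a (group, column) subscript, and accumarray averages the elements that share the same subscript pair: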

      [X, Y] = meshgrid(1:size(B, 2), iu);
      result = accumarray([Y(:), X(:)], reshape(B(loc, :), [], 1), [], @mean);
      

    Hope this helps! :)
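If some values of A(:, 1) might be missing from C, one option is to drop those rows instead of stopping at the assert. The following is a minimal sketch under that assumption: the extra row [99 7] is hypothetical and added only to show a value that is not in C, and the variable rows is a name introduced here.

```matlab
% Sample data from the question, plus one hypothetical row whose value 99 is not in C
A = [1 2; 3 2; 4 7; 10 2; 6 7; 10 9; 99 7];
B = [1 2 3; 4 4 9; 1 8 0; 3 7 9; 3 6 8];
C = [4; 10; 6; 3; 1];

[tf, loc] = ismember(A(:, 1), C);        % tf is false where the value is not in C
[U, ~, iu] = unique(A(tf, 2));           % group labels for the matched rows only
rows = loc(tf);                          % row of B for each matched row of A

% Pair every element of B(rows, :) with its (group, column) subscript
[X, Y] = meshgrid(1:size(B, 2), iu);
result = accumarray([Y(:), X(:)], reshape(B(rows, :), [], 1), ...
                    [numel(U), size(B, 2)], @mean);
```

With this filtering, a group whose rows are all unmatched simply does not appear in U, so whether dropping is preferable to failing depends on the data.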

    Example

    %// Sample input
    A = [1 2 ; 3 2; 4 7; 10 2; 6 7; 10 9];
    B = [1 2 3; 4 4 9; 1 8 0; 3 7 9; 3 6 8];
    C = [4; 10; 6; 3; 1];
    
    %// Compute means
    [U, ia, iu] = unique(A(:, 2));
    [tf, loc] = ismember(A(:, 1), C);
    [X, Y] = meshgrid(1:size(B, 2), iu);
    result = accumarray([Y(:), X(:)], reshape(B(loc, :), [], 1), [], @mean);
    

    The result is (rows correspond to the unique values 2, 7 and 9 in U):

    result = 
       3.3333   5.6667   8.6667
       1.0000   5.0000   1.5000
       4.0000   4.0000   9.0000
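As a cross-check on these numbers, the same group means can also be written as "sum divided by count" using a sparse indicator matrix. This is a different technique from the accumarray call, shown only as a sketch; G, counts and result2 are names introduced here.

```matlab
% Sample data from the question
A = [1 2; 3 2; 4 7; 10 2; 6 7; 10 9];
B = [1 2 3; 4 4 9; 1 8 0; 3 7 9; 3 6 8];
C = [4; 10; 6; 3; 1];

[U, ~, iu] = unique(A(:, 2));            % iu(r) is the group index of row r of A
[tf, loc] = ismember(A(:, 1), C);        % loc(r) is the row of B for row r of A
assert(all(tf))                          % every value must be present in C

% G(g, r) = 1 when row r of A belongs to group g
n = numel(iu);
G = sparse(iu, (1:n).', 1, numel(U), n);

counts = full(sum(G, 2));                % number of rows in each group
sums = full(G * B(loc, :));              % per-group column sums of the selected rows
result2 = bsxfun(@rdivide, sums, counts);  % divide each row by its group size
```

For the sample data, counts is [3; 2; 1] and result2 agrees with the accumarray result up to rounding.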