Search code examples
matlabgroupingmatlab-table

Group 4 tables into one, with each table as one column?


I have 4 tables (A,B,C,D) in Matlab with 102 columns (e.g. X, Y, W, Z), all containing 52000 rows (e.g. 0,1,2,...).

I want to merge them into one big table containing all of the data.

This is the output I want:


T   Column_names   A    B    C     D  

0       X          a(0)  b(0)  c(0)  d(0)
0       Y          a(0)  b(0)  c(0)  d(0)
0       W          a(0)  b(0)  c(0)  d(0)
0       Z          a(0)  b(0)  c(0)  d(0)
1       X          a(1)  b(1)  c(1)  d(1)
1       y          a(1)  b(1)  c(1)  d(1)
1       w          a(1)  b(1)  c(1)  d(1)
1       z          a(1)  b(1)  c(1)  d(1)
2    ...
...

Solution

  • I've created an example of 3 tables (A,B,C), each with 3 columns (X,Y,Z) and 4 rows.

    Then the following steps achieve what you want...

    1. It's trivial to add the row index T.

    2. You can then use stack to create a tall table with the columns stacked (and labelled as a new column)

    3. Finally an outerjoin will merge all of the tables together. You could just concatenate them, but this has two disadvantages

      • You would have to deal with duplicate column names
      • You would have to assume that the rows are in the same order.

    The code is as follows, please see the comments for details.

    % Dummy data
    X = (1:12).';
    Y = rand(12,1);
    Z = primes(40).';
    % Create tables with 4 rows each
    A = table( X(1:4), Y(1:4), Z(1:4), 'VariableNames', {'X','Y','Z'} );
    B = table( X(5:8), Y(5:8), Z(5:8), 'VariableNames', {'X','Y','Z'} );
    C = table( X(9:12), Y(9:12), Z(9:12), 'VariableNames', {'X','Y','Z'} );
    
    % Add the row index T
    A.T = (1:size(A,1)).';
    B.T = (1:size(B,1)).';
    C.T = (1:size(C,1)).';
    
    % Joining
    % First, stack the tables to get column names as a column
    As = stack( A, {'X','Y','Z'}, 'IndexVariableName', 'Column_names', 'NewDataVariableName', 'A' );
    Bs = stack( B, {'X','Y','Z'}, 'IndexVariableName', 'Column_names', 'NewDataVariableName', 'B' );
    Cs = stack( C, {'X','Y','Z'}, 'IndexVariableName', 'Column_names', 'NewDataVariableName', 'C' );
    
    % Now just concatenate the tables.
    % We can do this robustly with a 'join'.
    tbls = {As,Bs,Cs};
    % Loop over the tables for greatest flexibility
    output = tbls{1};
    for ii = 2:numel(tbls)
        output = outerjoin( output, tbls{ii}, 'Keys', {'T','Column_names'}, 'MergeKeys', true );
    end
    

    Output:

    output