Search code examples
matlab

update table and add missing columns - matlab


t1 = struct2table(struct('a',1, 'b',2, 'c',3));
t2 = struct2table(struct('a',NaN, 'c',4, 'd',5));

t1 =
    a    b    c
    _    _    _
    1    2    3

t2 =
     a     c    d
    ___    _    _
    NaN    4    5

I'd like to update t1 with the values from t2 on the common columns, and also add the unique t2 columns to t1.

I tried:

>> join(t1,t2, 'Keys',{'a','c'})
Error using tabular/join (line 172)
The key variables cannot contain any missing values.

>> outerjoin(t1,t2, 'Keys',{'a','c'}, 'Merge',1)

ans =

  2×4 table

     a      b     c     d 
    ___    ___    _    ___

      1      2    3    NaN
    NaN    NaN    4      5

Desired output:

     a      b     c     d 
    ___    ___    _    ___

    NaN      2    4     5

Solution

  • Your definition is

    Update t1 with the values from t2 on the common columns, and also add the unique t2 columns to t1.

    Which sounds equivalent to

    Use all columns from t2, and the columns from t1 which are not present in t2

    That's easier to parse into code, since you can do

    t1 = struct2table(struct('a',1, 'b',2, 'c',3));
    t2 = struct2table(struct('a',NaN, 'c',4, 'd',5));
    
    t1Vars = t1.Properties.VariableNames;
    t2Vars = t2.Properties.VariableNames;
    t3 = [t1( :, ~ismember(t1Vars, t2Vars)), t2];
    

    Which gives

    t3 =
      1×4 table
        b     a     c    d
        _    ___    _    _
    
        2    NaN    4    5
    

    The column order doesn't exactly match your "desired output", which seems like you want to prioritise the variables from t1 first, you can do this with

    t3 = t3( :, [t1Vars, setdiff(t2Vars,t1Vars,'stable')] );
    

    Which gives the desired output

    t3 =
      1×4 table
         a     b    c    d
        ___    _    _    _
    
        NaN    2    4    5