
Matlab sum struct array rows based on criteria


I have a struct array in Matlab as follows:

temp_links = struct('src',{},'dest',{}, 'type', {}, 'datarate', {});

The data in temp_links is as follows:

================================
src    dest    type    datarate
================================
sw_1   sw_2    sw       23
sw_1   sw_2    sw       34
sw_1   sw_2    sw       2
sw_1   sw_2    sw       3
sw_1   sw_3    sw       5
sw_1   sw_3    sw       8
sw_1   sw_3    sw       9
sw_1   sw_3    sw       3
sw_1   sw_3    sw       23
sw_1   sw_3    sw       20
sw_2   dev1    dev      30
sw_2   dev1    dev      20
...
================================

In the above case, I would like to sum the datarates for the same src and dest and get a new struct array as follows:

================================
src    dest    type    datarate
================================
sw_1   sw_2    sw       62
sw_1   sw_3    sw       68
sw_2   dev1    dev      50
...
================================

I am not sure how to achieve this. My first thought was to use a switch case for each src field and then populate the dest, but I am pretty sure there is a simpler way that hasn't occurred to me yet.

Could someone help me with this?


Solution

  • One approach is to identify the unique (src, dest, type) rows using unique and then use logical indexing to sum the data rates of the rows that belong to each one.

    For example:

    % Sample Data
    temp_links = struct('src',{'sw_1', 'sw_1', 'sw_1', 'sw_2', 'sw_2', 'sw_2'}, ...
                        'dest',{'sw_2', 'sw_2', 'sw_3', 'sw_1', 'dev_1', 'dev_1'}, ...
                        'type', {'sw', 'sw', 'sw', 'sw', 'dev', 'dev'}, ...
                        'datarate', {23, 34, 2, 5, 5, 5} ...
                        );
    
    % Locate and index each unique source, destination, and type
    [src_nodes, ~, src_idx] = unique({temp_links(:).src});
    [dest_nodes, ~, dest_idx] = unique({temp_links(:).dest});
    [types, ~, type_idx] = unique({temp_links(:).type});
    
    % Combine the indices and use to locate and index unique rows
    row_layout = [src_idx, dest_idx, type_idx];
    [unique_rows, ~, row_idx] = unique(row_layout, 'rows');
    
    % Initialize the result struct array (one element per unique row)
    joined_links = struct('src', src_nodes(unique_rows(:,1)), ...
                          'dest', dest_nodes(unique_rows(:,2)), ...
                          'type', types(unique_rows(:,3)), ...
                          'datarate', [] ...
                          );
    
    % Sum data rates for identical rows
    for ii = 1:size(unique_rows, 1)
        joined_links(ii).datarate = sum([temp_links(row_idx==ii).datarate]);
    end
    

    For our sample input structure:

     src       dest      type     datarate
    ______    _______    _____    ________
    
    'sw_1'    'sw_2'     'sw'     23      
    'sw_1'    'sw_2'     'sw'     34      
    'sw_1'    'sw_3'     'sw'      2      
    'sw_2'    'sw_1'     'sw'      5      
    'sw_2'    'dev_1'    'dev'     5      
    'sw_2'    'dev_1'    'dev'     5  
    

    We receive the following joined structure:

     src       dest      type     datarate
    ______    _______    _____    ________
    
    'sw_1'    'sw_2'     'sw'     57      
    'sw_1'    'sw_3'     'sw'      2      
    'sw_2'    'dev_1'    'dev'    10      
    'sw_2'    'sw_1'     'sw'      5 
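
The summing loop above can also be written without a loop. The following is a minimal sketch, not part of the original answer, that assumes the same sample temp_links and uses accumarray to add up the data rates per group. The names rep_idx, rates, summed and summed_cells are made up for illustration.

```matlab
% Sample data, copied from the example above
temp_links = struct('src',  {'sw_1', 'sw_1', 'sw_1', 'sw_2', 'sw_2', 'sw_2'}, ...
                    'dest', {'sw_2', 'sw_2', 'sw_3', 'sw_1', 'dev_1', 'dev_1'}, ...
                    'type', {'sw', 'sw', 'sw', 'sw', 'dev', 'dev'}, ...
                    'datarate', {23, 34, 2, 5, 5, 5});

% Index each key column; (:) forces column vectors so they sit side by side
[~, ~, src_idx]  = unique({temp_links.src});
[~, ~, dest_idx] = unique({temp_links.dest});
[~, ~, type_idx] = unique({temp_links.type});

% rep_idx: one representative input row per group; row_idx: group of each row
[~, rep_idx, row_idx] = unique([src_idx(:), dest_idx(:), type_idx(:)], 'rows');

% Sum the data rates of every group in a single call instead of a loop
rates = [temp_links.datarate];
summed = accumarray(row_idx(:), rates(:));

% Copy src/dest/type from the representative rows, then overwrite datarate
joined_links = temp_links(rep_idx);
summed_cells = num2cell(summed);
[joined_links.datarate] = summed_cells{:};
```

Because every row in a group shares the same src, dest and type, any member of the group can serve as the representative row, so the result does not depend on which occurrence unique returns.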
    

    Alternatively, if you are willing to use MATLAB's table data type (and have R2015b or later), findgroups and splitapply achieve the same result with less code.

    Using the same temp_links struct from above:

    % Convert to a table and assign a group number to each unique row
    temp_links = struct2table(temp_links);
    groups = findgroups(temp_links.src, temp_links.dest, temp_links.type);
    combined_datarate = splitapply(@sum, temp_links.datarate, groups);
    
    % Keep the first row of each group and replace its data rate with the sum
    [~, idx] = unique(groups);
    joined_links = temp_links(idx, :);
    joined_links.datarate = combined_datarate;
    
    

    This also returns:

     src       dest      type     datarate
    ______    _______    _____    ________
    
    'sw_1'    'sw_2'     'sw'     57      
    'sw_1'    'sw_3'     'sw'      2      
    'sw_2'    'dev_1'    'dev'    10      
    'sw_2'    'sw_1'     'sw'      5
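
Since the question asks for a struct array as the result, the table version can be converted back with table2struct. The sketch below is an illustration rather than part of the original answer: it starts again from the sample struct, relies on the extra outputs of findgroups to get one key value per group, and uses the made-up names T, link_type and joined_table.

```matlab
% Sample data, copied from the example above
temp_links = struct('src',  {'sw_1', 'sw_1', 'sw_1', 'sw_2', 'sw_2', 'sw_2'}, ...
                    'dest', {'sw_2', 'sw_2', 'sw_3', 'sw_1', 'dev_1', 'dev_1'}, ...
                    'type', {'sw', 'sw', 'sw', 'sw', 'dev', 'dev'}, ...
                    'datarate', {23, 34, 2, 5, 5, 5});

% One table row per struct element; (:) makes the struct array a column first
T = struct2table(temp_links(:));

% The extra outputs of findgroups hold the key values of each group
[groups, src, dest, link_type] = findgroups(T.src, T.dest, T.type);
datarate = splitapply(@sum, T.datarate, groups);

% Assemble the summary table, then convert it back to a struct array
joined_table = table(src, dest, link_type, datarate, ...
                     'VariableNames', {'src', 'dest', 'type', 'datarate'});
joined_links = table2struct(joined_table);
```

Taking the keys from findgroups avoids the separate unique call, and joined_links ends up as a struct array with the same four fields as the input.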