Search code examples
pythonmatlabdatatablesdata-analysis

Matlab - Assigning matching variables between two data sets and creating a new table


I'm currently working with two data sets, as shown below:

Data set 1:

Point_ID Record Difference (m)
'2804AJGCA57' 'Record003 - 220428_103738_Scanner_1 - 2804AJGCA57' '0.035240'
'2804AJGCA28' 'Record003 - 220428_103738_Scanner_1 - 2804AJGCA28' '0.030961'
'2804AJGCA29' 'Record003 - 220428_103738_Scanner_1 - 2804AJGCA29' '0.030219'

Data set 2:

Point_ID Easting Northing Elevation_OD
'2804AJGCA1' '200305.3884' '80809.76627' '7.25913'
'2804AJGCA2' '200304.9855' '80809.20396' '7.23274'
'2804AJGCA3' '200304.3783' '80808.51888' '7.20207'

Essentially, I need to compare the 1st column of both tables and if the 'Point_ID' from the 2nd data set is found within the 1st, I need to add the 'Difference (m)' column onto the corresponding 'Point_ID' row, in the 2nd data set, or a new table.

I hope that this makes sense. Currently I have the following code:

%% Import CSVs

GCA_data = readtable('Input/Test/GCA&GCP_Results_Flight1.csv', 'Delimiter',';', 'Format','%s %s %s'); %Insert pathway to GCA_Results csv.

XYZ_data= readtable('Input/Test/GCA&GCP_Flight1.csv','Delimiter',',','Format','%s %s %s %s'); %Insert pathway to the GCA XYZ file inputted into RiProcess.


%% Pre - Settings

ids = GCA_data.Object1; %Identifies all points that were used within the GCA calculations

nids = numel(ids); % Identifies the number of unique point ids.

gca_table = [];

for ii = 1:nids; 
   
    ID = ids{ii}; %Speicifies the point ID. 
    
    idx = ismember()
end

Solution

  • This is a type of join operation. MATLAB can do various different joins on table. See this doc page for more. I'm not sure, but I think you want a "left" join - this is what you need if some entries aren't present, like in this example:

    t1 = table(["aaa"; "bbb"; "ccc"], [1; 2; 3], ...
        'VariableNames', {'Point_ID', 'Difference (m)'});
    t2 = table(["bbb"; "ccc"; "ddd"], [200; 300; 400], ...
        'VariableNames', {'Point_ID', 'OtherValue'});
    
    % Match up rows by "Point_ID", get "Difference (m)" if
    % available and add to "OtherValue". 
    % MergeKeys=true means keep only one copy of the key "Point_ID"
    outerjoin(t2, t1, type="left", MergeKeys=true)
    

    Which gets:

      3×3 table
    
        Point_ID    OtherValue    Difference (m)
        ________    __________    ______________
    
         "bbb"         200               2      
         "ccc"         300               3      
         "ddd"         400             NaN