Search code examples
matlabcsv

Split CSV file based on characters in specific column


I have a csv file, with a layout as follows:

Easting Northing Elevation_OD Chainage FC Profile Reg_ID
245932.499 200226.982 5.464 148.997 CE N/A 8c12.11_021
245932.460 200225.448 5.214 150.530 S N/A 8c12.11_021
245932.485 200224.993 5.111 153.222 S N/A 8c12.1_001

The spreadsheet has ~55,000 rows and I need to be able to split the CSV based on specific research units - the '8c12.11' specifies a particular unit and the '_021' specifies a particular transect line and can be ignored. There are numerous units within this file and the character length of the units can vary e.g. '8c12.1' and I'm uncertain as to how to compensate for this alteration. Essentially the split needs to be able to identify the correct unit:

8c12.11 - The characters prior to the transect line (e.g _021)

and then compile all of the associated rows that correspond to this unit, into a new table.


Solution

  • The easiest way to do this is using extractBefore

    data.Unit = extractBefore( data.Reg_ID, '_' );