I'm trying to read a data file (*.txt) in Matlab. Each data record should have 19 columns (separated by tabs or spaces). However, the output file "wraps" each record: the first 15 columns are on one line and the remaining 4 columns go onto the next line. After a certain number of lines, the structure changes to 6 columns. I'm adding the following screenshot (mind you, not from the shared test file, but with the same structure) for ease of explanation, but the attached data file should explain things further.
As can be seen, lines 1~7556 have 19 columns (15 on one line and the remaining 4 wrapped onto the next line), and the lines from 7557 onward have 6 columns. These two structures repeat in the data file. Here is the pastebin link for the sample test file.
I tried the following code with no luck.
fid = fopen('test.txt');
C = cell2mat(textscan(fid, '%f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f %f'));
fclose(fid);
How can I read the data, and maybe get two separate readings (or data sets) that include the two different data structures?
Since you have consecutive delimiters, and delimiters at the start of some rows, it's not entirely straightforward to use Matlab's normal csv/text read functions. But reading the entire file and then deciding, from the number of missing values (NaNs) in each row, which dataset that row belongs to works. See the comments for an explanation.
% read csv, with consecutive delimiter join option
data = readmatrix('test.txt','Delimiter',' ','ConsecutiveDelimitersRule', 'join');
n_nans = sum(isnan(data),2); % identify number of nans per row
% get the row indices and identify for which dataset the rows are
rows_set1_1 = n_nans == 1; % one nan per full row for first dataset
rows_set1_2 = n_nans == 12; % 12 nans per row for the 'wrapped' lines
rows_set2 = n_nans == 10; % 10 nans per row for the 6-column second dataset
% divide data in two datasets
dataset1 = [data(rows_set1_1, :), data(rows_set1_2, :)]; % put each 15-column row next to its wrapped 4-column row (needs equal row counts)
dataset2 = data(rows_set2, :);
% remove nan columns
dataset1(:,all(isnan(dataset1),1)) = [];
dataset2(:,all(isnan(dataset2),1)) = [];
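If the NaN counts turn out to be fragile (they depend on how readmatrix pads the rows), another option is to count the numbers on each line directly. The following is only a minimal sketch of that idea, not a tested solution for your file. The sample text and the variable names (txt, vals, n, i15, i4, i6) are made up for illustration, and it assumes every 15-number line is followed somewhere by its matching 4-number line. For the real data, replace the sample with txt = fileread('test.txt');

```matlab
% Made-up sample with the same layout: two wrapped 19-column records,
% then two 6-column records. For real data: txt = fileread('test.txt');
row15 = sprintf(' %d', 1:15);
row4  = sprintf(' %d', 16:19);
row6  = sprintf(' %d', 101:106);
txt   = strjoin({row15, row4, row15, row4, row6, row6}, newline);

% Split into lines and parse every line into a numeric row vector
lines = regexp(txt, '\r?\n', 'split');
vals  = cellfun(@(s) sscanf(s, '%f').', lines, 'UniformOutput', false);
n     = cellfun(@numel, vals);   % how many numbers each line holds

% Classify lines by their number count
i15 = find(n == 15);   % first part of a wrapped record
i4  = find(n == 4);    % wrapped remainder of a record
i6  = find(n == 6);    % records of the second structure

% Each 15-number line needs exactly one 4-number partner
assert(numel(i15) == numel(i4), 'Unpaired wrapped lines found.');

dataset1 = [vertcat(vals{i15}), vertcat(vals{i4})];   % N-by-19
dataset2 = vertcat(vals{i6});                         % M-by-6
```

Because the lines are classified by their own length, leading or repeated delimiters do not matter here. Like the readmatrix version, this lumps all repeated blocks of each structure together. If you need each block separately, you would have to split on the positions where n changes.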