I'm trying to access the numerical data in a text file that is produced by a measurement device. The files are a mix of strings and numbers, and items are separated by tabs. It is not rectangular, which makes them not suitable for matlab's file loading functions, such as dlmread. I can convert the files to exel spreadsheets and load them with xlsread, but as I am dealing with hundreds of files, it is not an option for me.
Here is what the files look like:
Heading String
Commenttext
X Axis Trace A Trace B
Frequency Funct Ch1 Funct Ch2
Frequency / Hz Funct Ch1 / dBr Funct Ch2 / dBV
Hz dBr dBV
LOG LIN LIN
100 -60.35 -71.1446
10000 -7.48178 -12.0321
VOID VOID VOID
VOID VOID VOID
Scan #-5 Labeltext Labeltext
Hz V V
1.000000000000e+003 3.999999910593e-002 1.000000000000e+000
A-X/37 A-Y A-LimLow A-LimUpp A-RefY in V B-X/37 B-Y B-LimLow B-LimUpp B-RefY in V
1.000000000000e+002 -4.873095199691e+001 VOID VOID 3.999999910593e-002 1.000000000000e+002 -2.026775796775e+001 VOID VOID 1.000000000000e+000
1.172102297533e+002 -4.492478734843e+001 VOID VOID 3.999999910593e-002 1.172102297533e+002 -2.024411835772e+001 VOID VOID 1.000000000000e+000
1.373823795883e+002 -3.994765661259e+001 VOID VOID 3.999999910593e-002 1.373823795883e+002 -2.022767912575e+001 VOID VOID 1.000000000000e+000
1.610262027561e+002 -3.628116388971e+001 VOID VOID 3.999999910593e-002 1.610262027561e+002 -2.021939551014e+001 VOID VOID 1.000000000000e+000
1.887391822135e+002 -3.285059881019e+001 VOID VOID 3.999999910593e-002 1.887391822135e+002 -2.021928320409e+001 VOID VOID 1.000000000000e+000
2.212216291070e+002 -2.987476652701e+001 VOID VOID 3.999999910593e-002 2.212216291070e+002 -2.022570411546e+001 VOID VOID 1.000000000000e+000
2.500000000000e+002 -2.751190343935e+001 VOID VOID 3.999999910593e-002 2.500000000000e+002 -2.023346726318e+001 VOID VOID 1.000000000000e+000
2.592943797405e+002 -2.728162367715e+001 VOID VOID 3.999999910593e-002 2.592943797405e+002 -2.023601503648e+001 VOID VOID 1.000000000000e+000
etc.
Note that: - The value in front of Scan #-5 tells me how many repetitions there are in the file - A-X/37 tells me how many lines per scan there are. - The 'VOID' elements are replaced by actual numbers when applicable, and vice versa.
I would like to be able to extract the values that are present, and that my script is not dependent on the size of the traces, the number of repetitions or whether the values are numbers or 'VOID'.
So far, based on the tips given in other treads, I was able to create an array with the rows of the file and extract the 2 first numbers of each row like this:
fid=fopen('data.txt');
Rows = textscan(fid,'%s', 'delimiter','\n'); %Creates a temporary array with the rows of the file
fclose(fid);
TraceStarts=strfind(Rows{1,1},'Scan'); %Looks for the start of each trace..
TracesIdx = find(~cellfun('isempty', TraceStarts)); %.. and stores the indexes.
Traces= cellfun(@(x) textscan(x,'%f','delimiter','\t', 'CollectOutput',1), Rows{1,1});
Traces is then a cell containing arrays for each row with at max the two first numbers per row. Why it stops searching for float items in a row after it found 2 is a mystery to me... I don't want to specify the sequence of numbers and strings in each row, because they might vary.
Any help would be much appreciated.
Your textscan
is only reporting two elements due to the VOID
elements. Simply tell textscan to treat those as 'empty' so it reads them as NaN. See the TreatAsEmpty
name-value pair in the textscan documentation.
>> str = '1.000000000000e+002 -4.873095199691e+001 VOID VOID 3.999999910593e-002 1.000000000000e+002 -2.026775796775e+001 VOID VOID 1.000000000000e+000';
>> textscan(str,'%f','delimiter','\t', 'CollectOutput',1)
ans =
[2x1 double]
>> textscan(str,'%f','delimiter','\t', 'CollectOutput',1,'treatAsEmpty',{'VOID'})
ans =
[10x1 double]
>> ans{1}
ans =
100
-48.73095199691
NaN
NaN
0.03999999910593
100
-20.26775796775
NaN
NaN
1