matlab, text, input, textscan

Reading in a .txt file in MATLAB: line extraction with logic


I'm currently stumped trying to read in a .txt file with the general layout described below:

(The .txt file follows this general layout "N" times)

-----------------------------------
Header Info 1
Desired data 1
More data
More data
-----------------------------------
Header Info 2
Desired data 2
More data
-----------------------------------
Header Info 3
Desired data 3
More data
More data
More data
More data
----------------------------------
Header Info N
Desired data N
More data
More data
More data
CLOSING DATA LINE

I would like to extract only the "Desired data" lines along with the final "CLOSING DATA LINE". The twist is that a varying number of "More data" lines sits in between, anywhere from zero to hundreds, which rules out a simple fixed-stride line-by-line extraction.

I do know that my desired data is always 2 lines below each "---------------" line, so I was wondering whether there is a way to detect a "---------" line and extract the line 2 below it. I also need some logic to pick up the final line of the file.

I've thought of simply going through every line with fgetl and using if statements with strcmp to catch the "---------" lines, which seems pretty brute-force. Are there any lightweight or efficient solutions?


Solution

  • You can try the following example, assuming your text file is named a.txt:

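    % open the file, read every line into a cell array, then close it
    f = fopen('a.txt');
    d = textscan(f, '%s', 'Delimiter', '');
    fclose(f);
    
    % textscan returns a 1x1 cell that holds the cell array of lines
    dd = d{1};
    
    % indices of the '-------' lines; the ~isempty check is needed because
    % all(ismember('', '-')) is also true for an empty line
    myidx = find(cellfun(@(s) ~isempty(s) && all(ismember(s, '-')), dd));
    
    % desired data (2 lines below each separator) plus the closing line
    mydata = [dd(myidx + 2); dd(end)];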
    

    all(ismember(dd{k}, '-')) returns 1 if line dd{k} consists only of - characters, and 0 otherwise. cellfun applies that test to every line, giving an array of 1s and 0s in which 1 marks a separator line. Finally, find returns the indices of the 1 values, and adding 2 to each index selects the "Desired data" lines.
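
    To make the indexing concrete, here is a small self-contained sketch of the same idea. The file name sample.txt and its contents are made up for illustration. Two details differ from the answer above and are my own assumptions about what is wanted: the separator test uses regexp with the pattern '^-+$' (which cannot match an empty line) in place of all(ismember(...)), and separators with fewer than two lines after them are dropped so that myidx + 2 never runs past the end of the file. The sketch also passes an explicit '\n' delimiter to textscan.

```matlab
% write a small sample file (hypothetical contents, for illustration only)
sampleLines = {'-----', 'Header 1', 'Desired 1', 'More', ...
               '-----', 'Header 2', 'Desired 2', ...
               '-----', 'Header 3', 'Desired 3', 'More', 'More', ...
               'CLOSING DATA LINE'};
f = fopen('sample.txt', 'w');
fprintf(f, '%s\n', sampleLines{:});
fclose(f);

% read the file back, one line per cell
f = fopen('sample.txt', 'r');
d = textscan(f, '%s', 'Delimiter', '\n');
fclose(f);
dd = d{1};

% separator lines: one or more '-' characters and nothing else
myidx = find(~cellfun(@isempty, regexp(dd, '^-+$', 'once')));

% keep only separators that have at least two lines after them
myidx = myidx(myidx + 2 <= numel(dd));

% desired data (2 lines below each separator) plus the closing line
mydata = [dd(myidx + 2); dd(end)];
disp(mydata)
```

    Because everything is done with vectorized indexing on the cell array, the number of "More data" lines between separators does not matter; only the positions of the separator lines are used.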