
How to read and split words of a text file and then save in a single variable (MATLAB)


Below is the content of my text file:

<DOC>
<DOCNO>annotations/01/1515.eng</DOCNO>
<TITLE>Yacare Ibera</TITLE>
<DESCRIPTION>an alligator in the water;</DESCRIPTION>
<NOTES></NOTES>
<LOCATION>Corrientes, Argentina</LOCATION>
<DATE>August 2002</DATE>
<IMAGE>images/01/1515.jpg</IMAGE>
<THUMBNAIL>thumbnails/01/1515.jpg</THUMBNAIL>
</DOC>

How can I split the words inside it and store them in a single variable, like

x = 'annotations' '1515.eng' 'Yacare' ...and so on?


Solution

  • There are two steps. First, extract the text between the tags. Second, split the extracted text on delimiters. I assume the delimiters are / and a space. I also assume the file contents are loaded with the importdata function.

    Then:

    % load the file; the code below expects a cell array with one line per cell
    STR = importdata('testin');

    % remove the tags, leaving only the text between them
    B = regexprep(STR, '<.*?>', '');

    % split each non-empty line on the delimiters and append the pieces to C
    C = {};
    for i = 1:numel(B)
        if ~isempty(B{i})
            C = [C strsplit(B{i}, {'/', ' '})];
        end
    end
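
The two steps above can also be tried without a file and without a loop. This is only a minimal sketch: the inline cell array STR stands in for the file contents (it is an assumption, not the output of importdata), and the variable name words is made up for illustration. Instead of strsplit, it joins the stripped lines and uses regexp with the 'match' option to pick out every run of characters that is neither a slash nor whitespace.

```matlab
% Hypothetical inline data standing in for the lines of the text file
STR = {'<DOC>'; ...
       '<DOCNO>annotations/01/1515.eng</DOCNO>'; ...
       '<TITLE>Yacare Ibera</TITLE>'; ...
       '</DOC>'};

% Step 1: remove the tags, leaving only the text between them
B = regexprep(STR, '<.*?>', '');

% Step 2: join the lines with spaces, then collect every run of
% characters that is neither '/' nor whitespace
words = regexp(strjoin(B', ' '), '[^/\s]+', 'match');

disp(words)
```

For this sample, words is a 1-by-5 cell array holding 'annotations', '01', '1515.eng', 'Yacare' and 'Ibera'. Matching non-delimiter runs means that empty lines and repeated delimiters do not produce empty entries. Punctuation such as a trailing comma or semicolon stays attached to its word, as it does in the loop version.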