I have a text file that looks like this in Notepad++:
/a/apple 1
/b/bat 10
/c/cat 22
/d/dog 33
/h/human/female 34
I want to extract everything after the second slash and before the number at the end of each line. So the output I want is:
out = {'apple'; 'bat'; 'cat'; 'dog'; 'human/female'}
I wrote this code:
file = fopen('file.txt');
out = textscan(file, '%s', 'Delimiter', '\n');
fclose(file);
It gives:
out =
{365×1 cell}
out{1} =
'/a/apple 1'
'/b/bat 10'
'/c/cat 22'
'/d/dog 33'
'/h/human/female 34'
How can I get the required output from the text file, directly if possible? If that is not possible, is there a regular expression that would do it?
You can get the desired output directly from textscan, without any further processing:
file = fopen('file.txt');
out = textscan(file, '/%c/%s %d');
fclose(file);
out = out{2}
out =
5×1 cell array
'apple'
'bat'
'cat'
'dog'
'human/female'
Note that the two slashes in the format specifier string are treated as literal text and are left out of the output. Any additional slashes are captured in the string (%s). The %c between them reads exactly one character, so this format assumes the first path segment is a single character, as in the sample data. It is also unnecessary to specify a delimiter argument: the default delimiter is whitespace, so the trailing number is captured as a separate numeric value (%d).
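If the first path segment might be longer than one character, or if the lines are already in a cell array (as with out{1} in the question's code), a regular expression is one alternative. The following is a minimal sketch, assuming every line has the form /segment/name number. The variable names lines and tokens are illustrative, and the sample data is hard-coded from the question rather than read from file.txt.

```matlab
% Sample lines copied from the question; in practice this could be out{1}
% from the question's textscan(file, '%s', 'Delimiter', '\n') call.
lines = {'/a/apple 1'; '/b/bat 10'; '/c/cat 22'; '/d/dog 33'; '/h/human/female 34'};

% Pattern: a leading slash, the first segment (any run of non-slash
% characters), the second slash, then a lazily captured group that stops
% at the whitespace before the trailing number.
tokens = regexp(lines, '^/[^/]*/(.*?)\s+\d+\s*$', 'tokens', 'once');

% With 'once', each element of tokens is a 1x1 cell holding the captured
% text, so unwrap it to get a plain cell array of character vectors.
out = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);
```

For the sample data this gives the same five names as the textscan approach. A line that does not match the pattern would produce an empty token and make the unwrapping step fail, so input containing blank or malformed lines would need filtering first.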