Filter Strings in Array for Specific Attributes in MATLAB


I have a column of strings in a table representing the filenames in a folder as produced by the dir function.

tmpList = struct2table(dir('myFolder'));

The folder contains many different file types and folders. I want only the Excel files, which I can find by using:

filesData = [dir(['myFolder','\*.xlsx']);dir(['myFolder','\*.xls'])];

However, how do I extend or replace this so that I can filter tmpList.name to include only files that have the following attributes:

  • The first three letters are 'DTE' (in upper case, lower case, or a mix of both)
  • All the characters following DTE are digits, and there are between 6 and 8 of them
  • Extension is .xlsx or .xls

For example, in the following list only items 1 and 2 should be kept:

  1. 'DTE123456.xlsx'
  2. 'Dte01234567.xls'
  3. 'abc12345678.xlsx'
  4. 'DTE12345c34.xls'
  5. 'DTE123456.doc'

Solution

  • You can use a regular expression to find the file names that meet your criteria, and then cellfun to find the indices of those file names in tmpList:

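    tmpList = struct2table(dir('myFolder')); % your beginning
    
    % Case-insensitive match: "dte", then 6 to 8 digits, then .xls or .xlsx
    filteredTmpList = regexpi(tmpList.name(:), '^dte\d{6,8}\.xlsx?$', 'match');
    % Keep only the table rows whose name produced a match
    finalList = tmpList(~cellfun('isempty',filteredTmpList), :);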
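
To check the pattern without touching a real folder, here is a minimal sketch that runs the same regular expression on the five example names from the question. The variable names (names, pattern, startIdx, keep, kept) are only illustrative and are not part of the original answer.

```matlab
% Example names taken from the question
names = {'DTE123456.xlsx'; 'Dte01234567.xls'; 'abc12345678.xlsx'; ...
         'DTE12345c34.xls'; 'DTE123456.doc'};

% ^dte      the name starts with "dte" (regexpi ignores case)
% \d{6,8}   followed by 6 to 8 digits
% \.xlsx?$  then ".xls" or ".xlsx" at the very end of the name
pattern = '^dte\d{6,8}\.xlsx?$';

% With 'once', each cell holds the start index of the match,
% or is empty when the name does not match
startIdx = regexpi(names, pattern, 'start', 'once');
keep = ~cellfun('isempty', startIdx);

kept = names(keep);
disp(kept)
```

Only the first two names should be kept: the third does not start with DTE, the fourth has a letter among the digits, and the fifth has the wrong extension. The anchors ^ and $ matter here, because without them a name that merely contains a matching substring would also pass.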