Specifying *.xls filter in Get-ChildItem also returns *.xlsx results


I have a folder that contains .xls, .xlsx and .xlsm files, and I would like to filter for just the .xls files.

Why is the following line not working as I'd expect it to? I see .xls, .xlsx and .xlsm results.

Get-ChildItem $(Get-Location) -Filter *.xls | ForEach-Object { $_.Extension }

Solution

  • The -Filter parameter's wildcard matching is not performed by PowerShell; the pattern is passed through to the filesystem provider and ultimately to the Windows API. The matching performed there is burdened with many legacy behaviors and quirks, including the one you saw:

    • In Windows PowerShell, -Filter *.xls effectively behaves like -Filter *.xls*, so it matches both foo.xls and foo.xlsx, for instance. This happens because the 8.3 (short) file names are also matched behind the scenes: foo.xlsx's 8.3 file name looks something like FOO~1.XLS; note the truncation (and capitalization) of .xlsx to .XLS.

    • Fortunately, the short-name matching behavior no longer occurs in PowerShell (Core) 7. However, other legacy quirks persist[1], as does the most notable difference (which won't go away): only PowerShell wildcard expressions (see about_Wildcards) support character ranges / sets via [...] (e.g., [a-z]); these are not supported with -Filter.

    • Use of the -Filter parameter is in general still preferable to -Path / -Include (see below) due to its superior performance (filtering happens at the source, instead of after the fact in PowerShell).

    The workaround is to use the -Path parameter in order to use PowerShell's wildcard matching:

    Get-ChildItem -Path (Join-Path (Get-Location) *.xls) | ForEach-Object { $_.Extension }
    
    # Or, more simply
    Get-ChildItem -Path $PWD/*.xls | ForEach-Object Extension
    

    Note: With -Recurse you'd use the -Include parameter instead; without -Recurse, the behavior of -Include (and -Exclude) is unintuitive, unfortunately - see the bottom section of this answer.
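If you want to keep -Filter's performance and still get exact matches, one option is to post-filter its results with Where-Object. The following is a minimal sketch rather than a definitive recipe; the sandbox directory and the file names a.xls, b.xlsx and c.xlsm are made up for illustration. It also shows the -Recurse / -Include variant and a character range, which only PowerShell's own wildcard matching supports.

```powershell
# Hypothetical sandbox directory with one file per extension from the question.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("xls-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
'a.xls', 'b.xlsx', 'c.xlsm' | ForEach-Object {
    $null = New-Item -ItemType File -Path (Join-Path $dir $_)
}

# -Filter narrows the candidates at the source (and may over-match in Windows PowerShell);
# Where-Object then keeps only files whose extension is exactly .xls.
$exact = Get-ChildItem -Path $dir -Filter *.xls | Where-Object Extension -eq '.xls'

# Recursive variant: -Include uses PowerShell's own wildcard matching.
$recursive = Get-ChildItem -Path $dir -Recurse -Include *.xls

# Character ranges work with -Path, but not with -Filter.
$ranged = Get-ChildItem -Path (Join-Path $dir '[a-b]*.xls*')

$exact.Name      # a.xls
$recursive.Name  # a.xls
$ranged.Name     # a.xls and b.xlsx

# Clean up the sandbox directory.
Remove-Item -LiteralPath $dir -Recurse -Force
```

The post-filter costs little, because Where-Object only sees the files that -Filter has already let through.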


    [1] Notable other quirks:

    • Multiple consecutive ? wildcards can match names with fewer characters.

      • E.g., Get-ChildItem -Filter ??.txt matches aa.txt and unexpectedly also a.txt
    • Pattern *. matches extension-less file and directory names.

      • E.g., Get-ChildItem -File -Filter *. finds all files (-File) whose names do not have an extension (e.g., file); this quirk can actually be useful, in that it is the simplest and best-performing way to locate extension-less files (-Path *. does not work, because it looks for a file name literally ending in a .).

      • Note: This was temporarily changed in PowerShell Core 6.x (as of 6.2.3), but the behavior is back as of PowerShell Core 7.0.

    • Conversely, *.* includes extension-less file and directory names as well.
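Where these quirks get in the way, PowerShell's own wildcard matching or a post-filter gives predictable results regardless of version. A minimal sketch, again using a made-up sandbox directory and illustrative file names (a.txt, aa.txt, file):

```powershell
# Hypothetical sandbox directory; all file names are for illustration only.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("quirk-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
'a.txt', 'aa.txt', 'file' | ForEach-Object {
    $null = New-Item -ItemType File -Path (Join-Path $dir $_)
}

# In PowerShell's wildcard language, ? matches exactly one character,
# so ??.txt requires a base name of exactly two characters.
$twoChars = Get-ChildItem -Path (Join-Path $dir '??.txt')

# Alternative to -Filter *. that does not rely on the legacy quirk:
# keep only files whose Extension property is empty.
$noExtension = Get-ChildItem -Path $dir -File | Where-Object { -not $_.Extension }

$twoChars.Name     # aa.txt
$noExtension.Name  # file

# Clean up the sandbox directory.
Remove-Item -LiteralPath $dir -Recurse -Force
```

The Where-Object approach is slower than -Filter *. on large directories, since every file is passed to PowerShell before being tested.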

    See this excellent answer by Zenexer for the backstory and the gory details.