
Transferring files whose names start with digits from a source directory to a destination directory


I want a PowerShell script that copies the files whose names start with 4 digits from a source directory to a destination directory. It should also search the source directory for subfolders and copy any files in them whose names start with 4 digits to the destination directory. It should skip all files whose names start with a letter or with fewer than 4 digits. The images below show what I want to achieve:

[screenshot of the source folder's contents]

I want to copy all the files except those whose names do not start with four digits, including those with no digits at all. So, from the above screenshot, I don't want 1.pdf and New Test Documet.txt to be copied to the destination folder. This is what I tried:

Get-ChildItem -Recurse -File -LiteralPath D:\testDigits -Include [0-9][0-9][0-9][0-9] -exclude [a-z][A-Z] |
  Copy-Item -Destination D:\transferTo\ 

transferTo is the name of the destination folder. The above script transfers all the files, including 1.pdf and New Test Documet.txt, to the destination folder. I'm not sure what I am doing wrong.


Solution

  • Both the -Include and -Exclude parameters of provider cmdlets such as Get-ChildItem expect wildcard expressions that must match the names of the input items in full.

    Thus, assuming the following:

    • all files to match have a filename extension; to lift this constraint, replace *.* with just * below.
    • the base file name (the part before the extension) may either start with 4 digits followed by arbitrary additional characters, or consist of 4 digits only

    you can use [0-9][0-9][0-9][0-9]*.*

    In the context of your command:

    Get-ChildItem -Recurse -File -LiteralPath D:\testDigits -Include [0-9][0-9][0-9][0-9]*.* |
      Copy-Item -Destination D:\transferTo\
    

    Note that you then do not also need an -Exclude argument.
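
The full-match rule can be checked without touching the file system, because the -like operator accepts the same kind of wildcard expressions. The following is a minimal sketch; the sample names are made up for illustration (1.pdf and New Test Documet.txt are taken from the question):

```powershell
# Wildcard expression from the answer: 4 digits, then anything, then a dot, then anything.
$pattern = '[0-9][0-9][0-9][0-9]*.*'

# Sample file names (hypothetical, except the two mentioned in the question).
$names = '1234.pdf', '1234abc.txt', '1234', '123.txt', '1.pdf', 'New Test Documet.txt'

# -like must match the whole string, just as -Include must match the whole name.
$matched = $names | Where-Object { $_ -like $pattern }
$matched   # 1234.pdf and 1234abc.txt

# The pattern from the question matches only names that are exactly 4 digits long.
$bare = '1234.pdf' -like '[0-9][0-9][0-9][0-9]'
$bare      # False
```

Note that 1234 (no extension) is rejected by *.* at the end of the pattern; with just * it would be accepted.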

    Caveat:

    • Due to your use of -Recurse, -Include and -Exclude do behave as expected, but in non-recursive processing they exhibit unintuitive behavior - see the bottom section of this answer.

    If you need more sophisticated matching, use the approach suggested by Daniel:

    The - slower - (near-)equivalent of the above is:

    Get-ChildItem -Recurse -File -LiteralPath D:\testDigits |
      Where-Object Name -match '^[0-9]{4}' |
      Copy-Item -Destination D:\transferTo\
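
To try the Where-Object pipeline without risking real data, it can be run against a throwaway directory tree first. This is only a sketch: the temporary folder layout and the file names other than 1.pdf and New Test Documet.txt are assumptions made for the demonstration.

```powershell
# Create a disposable source tree and destination under the temp directory (hypothetical names).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("digits-demo-" + [guid]::NewGuid())
$src  = Join-Path $root 'testDigits'
$sub  = Join-Path $src 'sub'
$dst  = Join-Path $root 'transferTo'
$null = New-Item -ItemType Directory -Path $sub, $dst -Force

# Files in the top-level folder and in one subfolder.
'1234.pdf', '5678 report.txt', '1.pdf', 'New Test Documet.txt' |
  ForEach-Object { $null = New-Item -ItemType File -Path (Join-Path $src $_) }
'9012.txt', '123.txt' |
  ForEach-Object { $null = New-Item -ItemType File -Path (Join-Path $sub $_) }

# Same pipeline as above, pointed at the disposable folders.
Get-ChildItem -Recurse -File -LiteralPath $src |
  Where-Object Name -match '^[0-9]{4}' |
  Copy-Item -Destination $dst

# List what arrived in the destination.
$copied = Get-ChildItem -File -LiteralPath $dst | Sort-Object Name | ForEach-Object Name
$copied   # 1234.pdf, 5678 report.txt, 9012.txt

# Clean up the disposable tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

All matching files land directly in the destination folder, so two files with the same name in different subfolders would collide there.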
    

    Note:

    • Unlike wildcard expressions, which match against the input string in full, regexes match substrings by default.

    • The ^ in regex ^[0-9]{4} ensures that four ({4}) decimal digits ([0-9]) only match at the start of each input string.

      • As such, the above would also match extension-less files, unlike the [0-9][0-9][0-9][0-9]*.* wildcard expression, but like the [0-9][0-9][0-9][0-9]* variant.

        • Requiring a file-name extension would require a more complex regex, such as
          ^[0-9]{4}[^.]*\..
      • Note: \d is a commonly used shorthand class for a digit, though, strictly speaking, it also matches characters classified as digits in other scripts across the entire Unicode "alphabet": not just the ASCII-range characters 0 to 9, but also, for instance, ٠ (ARABIC-INDIC DIGIT ZERO, U+0660).
        However, in practice this is rarely a concern, and \d is frequently used even when only 0 through 9 should be matched.
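
The points in this note can be verified with the -match operator on plain strings. A small sketch, using made-up sample names:

```powershell
# Without ^ the regex matches a 4-digit run anywhere in the name; with ^ only at the start.
$unanchored = 'ab1234.txt' -match '[0-9]{4}'     # True
$anchored   = 'ab1234.txt' -match '^[0-9]{4}'    # False

# The extension-requiring regex rejects a name without a dot.
$noExt   = '1234'     -match '^[0-9]{4}[^.]*\..' # False
$withExt = '1234.pdf' -match '^[0-9]{4}[^.]*\..' # True

# Four ARABIC-INDIC DIGIT ZERO characters (U+0660) followed by an extension.
$arabic = ([string][char]0x0660) * 4 + '.txt'
$viaD     = $arabic -match '^\d{4}'       # True:  \d covers Unicode digits
$viaRange = $arabic -match '^[0-9]{4}'    # False: [0-9] covers only ASCII 0 to 9
```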


    Performance considerations:

    • A wildcard-based -Include / -Exclude outperforms a post-filtering solution based on Where-Object.

    • The fastest - but limited - solution (which is not an option in the case at hand) is to use the -Filter parameter, because it filters at the provider source rather than retrieving all files first and then performing filtering.

      • The wildcard expressions accepted by -Filter (in the context of the FileSystem provider) are in essence limited to the * and ? constructs, along with obscure legacy behavior for backward-compatibility - see this answer.
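
Although -Filter cannot express "starts with 4 digits", it can still act as a coarse pre-filter when part of the criterion fits * and ?. The sketch below assumes, purely for illustration, that only .pdf files are of interest (the question states no such restriction), and uses a disposable folder with made-up file names:

```powershell
# Disposable folder with a few sample files (hypothetical names).
$src = Join-Path ([System.IO.Path]::GetTempPath()) ("filter-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $src
'1234.pdf', '1.pdf', 'abcd.pdf', '5678.txt' |
  ForEach-Object { $null = New-Item -ItemType File -Path (Join-Path $src $_) }

# -Filter narrows the candidates by extension at the provider level;
# Where-Object then applies the 4-digit test that -Filter cannot express.
$hits = Get-ChildItem -File -LiteralPath $src -Filter '*.pdf' |
  Where-Object Name -match '^[0-9]{4}' |
  ForEach-Object Name
$hits   # 1234.pdf

# Clean up the disposable folder.
Remove-Item -LiteralPath $src -Recurse -Force
```

Whether this is faster than -Include alone depends on how many files the extension filter rules out up front.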