
Retain initial characters in file names, remove all remaining characters using PowerShell


I have a batch of files with names like: 78887_16667_MR12_SMITH_JOHN_713_1.pdf

I need to retain the first three sets of numbers and remove everything between the third "_" and "_1.pdf".

So this: 78887_16667_MR12_SMITH_JOHN_713_1.pdf

Becomes this: 78887_16667_MR12_1.pdf

Ideally, I'd like to use the third "_" as the break point, because the third set varies in length: sometimes 3 characters, sometimes 4 (as in the example), and sometimes 5.

If I used something like this:

Get-ChildItem Default_*.pdf | Rename-Item -NewName {$_.name -replace... 

...and then I'm stuck: can I state that everything between the 3rd "_" and the 6th "_" should be replaced with "" (nothing)? My understanding is that I'd also include ".Extension" to preserve the extension.


Solution

  • You can use the -split operator to split your name into _-separated tokens, extract the tokens of interest, and then join them again with the -join operator:

    PS> ('78887_16667_MR12_SMITH_JOHN_713_1.pdf' -split '_')[0..2 + -1] -join '_'
    78887_16667_MR12_1.pdf
    

    0..2 extracts the first 3 tokens, and -1 the last one; the + appends -1 to the array produced by the range (you could write this array of indices as 0, 1, 2, -1 as well).

    Applied in the context of renaming files:

    Get-ChildItem -Filter *.pdf | Rename-Item -NewName {
        ($_.Name -split '_')[0..2 + -1] -join '_'
    } -WhatIf
    

    The common parameter -WhatIf previews the rename operation; remove it to perform the actual renaming.
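Since the question asked about -replace specifically, here is a rough sketch of a regex-based alternative to the -split / -join approach above. The sample names and the $pattern / $results variables are made up for illustration, and the pattern assumes the tokens themselves never contain "_". One difference from the index-based version: a name that doesn't have at least three tokens followed by more underscores is returned unchanged, because -replace leaves non-matching input as-is.

```powershell
# Hypothetical sample names; the third token is 4, 3, or 5 characters long.
$names = @(
    '78887_16667_MR12_SMITH_JOHN_713_1.pdf'
    '78887_16667_MR1_DOE_JANE_42_1.pdf'
    '78887_16667_MR123_LEE_2.pdf'
    'already_short.pdf'
)

# ^((?:[^_]+_){3})  captures the first three tokens, each with its trailing "_"
# .*_               skips everything up to and including the last "_"
# ([^_]+)$          captures the final token, including the extension
$pattern = '^((?:[^_]+_){3}).*_([^_]+)$'

# -replace works element-wise on an array; '$1$2' is single-quoted so that
# PowerShell passes the capture-group references through to the regex engine.
$results = $names -replace $pattern, '$1$2'
$results
```

If that behaves as expected on your names, the same expression should drop into the script block, i.e. { $_.Name -replace $pattern, '$1$2' }, again with -WhatIf for a preview first.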