I have a batch of files with names like: 78887_16667_MR12_SMITH_JOHN_713_1.pdf
I need to retain the first three sets of numbers and remove everything between the third "_" and "_1.pdf".
So this: 78887_16667_MR12_SMITH_JOHN_713_1.pdf
Becomes this: 78887_16667_MR12_1.pdf
Ideally, I'd like to just use the 3rd "_" as the break, because the third set of characters is sometimes 3 characters long, sometimes 4 (as in the example), and other times 5.
If I used something like this:
Get-ChildItem Default_*.pdf | Rename-Item -NewName {$_.name -replace...
...and then I'm stuck: can I state that everything between the 3rd "_" and the 6th "_" should be replaced with "" (nothing)? My understanding is that I'd include ".Extension" to save the extension, too.
You can use the -split operator to split your name into _-separated tokens, extract the tokens of interest, and then join them again with the -join operator:
PS> ('78887_16667_MR12_SMITH_JOHN_713_1.pdf' -split '_')[0..2 + -1] -join '_'
78887_16667_MR12_1.pdf
0..2 extracts the first 3 tokens, and -1 the last one (you could write this array of indices as 0, 1, 2, -1 as well).
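Because the tokens are picked by position rather than by length, the width of the third token should not matter. A minimal sketch to check this, assuming a hypothetical helper named Get-ShortName and made-up sample names whose third token has 3, 4 and 5 characters:

```powershell
# Hypothetical helper (not part of the original answer): wraps the
# split / index / join expression so it can be tried on plain strings.
function Get-ShortName {
    param([string] $Name)
    # Keep tokens 0, 1, 2 and the last token (-1), then rejoin with "_".
    ($Name -split '_')[0..2 + -1] -join '_'
}

# Made-up sample names; only the length of the third token differs.
$samples = '78887_16667_M12_SMITH_JOHN_713_1.pdf',
           '78887_16667_MR12_SMITH_JOHN_713_1.pdf',
           '78887_16667_MR123_SMITH_JOHN_713_1.pdf'

$samples | ForEach-Object { Get-ShortName $_ }
# 78887_16667_M12_1.pdf
# 78887_16667_MR12_1.pdf
# 78887_16667_MR123_1.pdf
```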
Applied in the context of renaming files:
Get-ChildItem -Filter *.pdf | Rename-Item -NewName {
  ($_.Name -split '_')[0..2 + -1] -join '_'
} -WhatIf
The common parameter -WhatIf previews the rename operation; remove it to perform the actual renaming.
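Since the question asked about -replace, the same result can likely be reached with a single regex instead of splitting and joining. This is only a sketch of an alternative, not the approach used above; the $pattern and $result variable names are illustrative, and it assumes each name has at least three "_"-separated tokens followed by at least one more "_":

```powershell
# Illustrative alternative using -replace.
# ^((?:[^_]+_){3})  captures the first three tokens, each with its trailing "_"
# .*_               greedily consumes everything up to and including the last "_"
$pattern = '^((?:[^_]+_){3}).*_'

# The matched part is replaced by capture group 1; the final token is untouched.
$result = '78887_16667_MR12_SMITH_JOHN_713_1.pdf' -replace $pattern, '$1'
$result   # 78887_16667_MR12_1.pdf

# Same idea in the rename pipeline; -WhatIf only previews the operation.
Get-ChildItem -Filter *.pdf |
    Rename-Item -NewName { $_.Name -replace $pattern, '$1' } -WhatIf
```

A name that does not match the pattern is returned unchanged by -replace, so such files keep their current names.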