regex windows filenames pcre bulk-rename-utility

RegEX: match ([^A-Z]*) but stop if you find "foo"


I ran into a problem that I can't figure out how to fix. I have a series of video files with names like this:

FooBar-tooGGG1 - s01e01 - (HW) - SomeText.mp4

So I have to make sure that spaces are added before capital letters BUT ONLY until "- s01e" appears (thus ignoring the rest of the text):

Foo Bar-too GGG 1 - s01e01 - (HW) - SomeText.mp4

Looking around I stumbled upon these RegEX:

(?-i)([A-Z]{1})([^A-Z]*)

Replace with: $1$2


.+?(?=abc)

or

\w*(?<!foo)bar

or

^(?:(?!foo).)*

and I played with them a little on Regex101, but I only ever end up with two kinds of results:

(?-i)([A-Z]{1})([^A-Z]*.+?(?= - s01e))

or

(?-i)([A-Z]{1})([^A-Z]*)/g

Respectively:

F ooBar-tooGGG1 - s01e01 - (HW) - SomeText.mp4

and

Foo Bar-too GGG1 - s01e01 - ( H W) -  Some Text.mp4

I'm not very good at RegEx, but I've been trying everything since this morning: putting the lookahead in the middle, using +? instead of * or .*, etc.

Regex engine: PCRE2 or PCRE. If Bulk Rename Utility isn't suitable, I also have FlexibleRenamer and RegexRenamer (also for Windows).


Solution

  • In Bulk Rename Utility, you can use

    (?<=[a-z])(?=[A-Z])(?=.*\s[sS]\d{1,2}[eE]\d)/g
    

    Set the replacement to a space. Make sure the v2 check box is set.


    You may also use PowerShell (press CTRL+Esc, start typing PowerShell and press ENTER):

    cd 'FILES_FOLDER_PATH_HERE'
    $files = Get-ChildItem -Filter '*.mp4'
    $files | Rename-Item  -NewName {$_.name -creplace '(?<=\p{Ll})(?=\p{Lu})(?=.*\s[sS]\d{1,2}[eE]\d)',' ' }
    

    Here is the regex demo.

    Details

    • cd 'FILES_FOLDER_PATH_HERE' - moving to the folder with your files
    • $files = Get-ChildItem -Filter '*.mp4' - getting all files in that folder with the mp4 extension
    • $files | Rename-Item -NewName {$_.name -creplace '(?<=\p{Ll})(?=\p{Lu})(?=.*\s[sS]\d{1,2}[eE]\d)',' ' } - renaming the files using a case-sensitive (due to -creplace) regex search and replace.
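Before renaming anything for real, it can help to preview what would change. The following is a minimal sketch in Python, not part of the original answer: the function name plan_renames and its glob parameter are hypothetical, and because Python's re module does not support \p{Ll} / \p{Lu}, the ASCII classes [a-z] / [A-Z] are used as a stand-in. It only builds a list of old and new names and does not touch any file.

```python
import re
from pathlib import Path

# ASCII-only stand-in for the pattern above (assumption: file names use
# plain ASCII letters; Python's re has no \p{Ll} / \p{Lu}).
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])(?=.*\s[sS]\d{1,2}[eE]\d)")


def plan_renames(folder, glob="*.mp4"):
    """Return (old_name, new_name) pairs for files whose name would change.

    Hypothetical helper: it only computes the new names, nothing is renamed.
    """
    plan = []
    for path in sorted(Path(folder).glob(glob)):
        new_name = BOUNDARY.sub(" ", path.name)
        if new_name != path.name:
            plan.append((path.name, new_name))
    return plan
```

Calling plan_renames('FILES_FOLDER_PATH_HERE') and printing the pairs gives a dry run; files whose names contain no lowercase-to-uppercase boundary before the episode marker are simply left out of the plan.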

    The regex matches

    • (?<=\p{Ll}) - a location immediately preceded by a lowercase letter (\p{Ll} is a Unicode variant of [a-z])
    • (?=\p{Lu}) - a location immediately followed by an uppercase letter (\p{Lu} is a Unicode variant of [A-Z])
    • (?=.*\s[sS]\d{1,2}[eE]\d) - a location immediately followed by
      • .* - any text (other than newlines)
      • \s - a whitespace
      • [sS] - s or S
      • \d{1,2} - one or two digits
      • [eE] - e or E
      • \d - a digit.
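
To see what the last lookahead contributes, the breakdown above can be checked against the file name from the question. This is a small sketch under assumptions: it uses Python's re module with ASCII classes ([a-z], [A-Z]) in place of \p{Ll} and \p{Lu}, and the names WITH_STOP, NO_STOP and split_positions are made up for illustration.

```python
import re

SAMPLE = "FooBar-tooGGG1 - s01e01 - (HW) - SomeText.mp4"

# Full pattern: lowercase/uppercase boundary that still has " sNNeN" ahead of it.
WITH_STOP = re.compile(r"(?<=[a-z])(?=[A-Z])(?=.*\s[sS]\d{1,2}[eE]\d)")
# Same boundary test without the "episode marker must follow" condition.
NO_STOP = re.compile(r"(?<=[a-z])(?=[A-Z])")


def split_positions(pattern, text):
    """Return the indexes where the zero-width pattern matches (illustrative helper)."""
    return [m.start() for m in pattern.finditer(text)]


with_stop = WITH_STOP.sub(" ", SAMPLE)
no_stop = NO_STOP.sub(" ", SAMPLE)

print(with_stop)  # Foo Bar-too GGG1 - s01e01 - (HW) - SomeText.mp4
print(no_stop)    # Foo Bar-too GGG1 - s01e01 - (HW) - Some Text.mp4
```

Without the final lookahead, the boundary inside SomeText is also matched, because nothing ties the match to the part before the episode marker. Note that the pattern only splits where a lowercase letter meets an uppercase one, so GGG1 stays as it is rather than becoming GGG 1.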