regex vba vbscript catia

Extract a specific substring that is not preceded by a specified substring


I have a question that is relatively simple but has me stumped. I want a regex to do the following task in VBA (VBScript_RegExp_5.5).

Given a string like this:

"PrivateFactoryAsclsFactoryPrivateFactoryAsclsFactory"

I want to replace only the occurrences of "Factory" that are not preceded by "cls". For this particular case, if every match is correctly replaced with "_", the result looks like this:

"Private_AsclsFactoryPrivate_AsclsFactory"

Of course, a simple exclusion trick like "clsFactory|(Factory)" does not work in VBA, and look-behinds are not supported either.


Solution

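  • You can actually emulate a negative lookbehind with a bit of a hack using the VBScript Regex library. The following pattern captures either up to two characters at the very start of the string, or any three characters that aren't "cls" (checked with a negative lookahead), followed by "Factory". The captured characters are written back with "$1" in the replacement, so only "Factory" itself is replaced.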

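    Sub DemoLookbehindHack()
        Const strText As String = "PrivateFactoryAsclsFactoryPrivateFactoryAsclsFactory"

        With CreateObject("VBScript.RegExp")
            ' Group 1: up to two characters at the start of the string,
            ' or any three characters other than "cls".
            .Pattern = "(^.{0,2}|(?!cls).{3})Factory"
            .Global = True   ' replace every occurrence, not just the first
            ' $1 restores the captured prefix; only "Factory" becomes "_".
            Debug.Print .Replace(strText, "$1_")
        End With
    End Sub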
    

    Output:

    Private_AsclsFactoryPrivate_AsclsFactory
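
    One caveat: the pattern consumes the characters in front of each "Factory", so an occurrence that starts fewer than three characters after the previous match is skipped ("FactoryFactory" becomes "_Factory"). If that matters for your input, a rough alternative is to use the alternation from the question and rebuild the string by hand, since the VBScript library has no replacement callback. This is only a sketch: the names ReplaceUnprefixed and DemoReplaceUnprefixed are made up for illustration, and the capture group is dropped because the code checks the matched text directly.

    ```vba
    ' Hypothetical helper, not part of the original answer.
    ' Replaces "Factory" with "_" unless it is part of "clsFactory".
    Function ReplaceUnprefixed(ByVal strText As String) As String
        Dim objMatch As Object
        Dim strResult As String
        Dim lngPos As Long

        lngPos = 1  ' VBA strings are 1-based; Match.FirstIndex is 0-based
        With CreateObject("VBScript.RegExp")
            ' "clsFactory" is listed first so it swallows the protected form.
            .Pattern = "clsFactory|Factory"
            .Global = True
            For Each objMatch In .Execute(strText)
                ' Copy the text between the previous match and this one.
                strResult = strResult & Mid$(strText, lngPos, objMatch.FirstIndex + 1 - lngPos)
                If objMatch.Value = "Factory" Then
                    strResult = strResult & "_"             ' bare "Factory": replace it
                Else
                    strResult = strResult & objMatch.Value  ' "clsFactory": keep it
                End If
                lngPos = objMatch.FirstIndex + objMatch.Length + 1
            Next objMatch
        End With
        ' Append whatever follows the last match.
        ReplaceUnprefixed = strResult & Mid$(strText, lngPos)
    End Function

    Sub DemoReplaceUnprefixed()
        ' Prints: Private_AsclsFactoryPrivate_AsclsFactory
        Debug.Print ReplaceUnprefixed("PrivateFactoryAsclsFactoryPrivateFactoryAsclsFactory")
        ' Prints: __
        Debug.Print ReplaceUnprefixed("FactoryFactory")
    End Sub
    ```

    The single-pattern version above is shorter and is enough when occurrences are never that close together; the loop trades brevity for handling adjacent matches.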