
Regex Positive Lookbehind alternative in VBScript


VBScript apparently doesn't support lookbehind at all.

I am looking for an alternative, valid regex that I can use with VBScript.

FYI, I will use this in HP UFT, so I have no choice but to use VBScript. (If there is no simpler way, I might have to look into other options, such as executing Java or other-language code from VBS.)

What I am trying to achieve:
From a given block of text, I want to extract a certain alphanumeric string. This string may include -, _, ., /, //, etc.

The only thing I know is that this string will be preceded by a specific word (for example DIA), and there will be a space after the string.

Here is a VBS code snippet that I can use as an alternative:
This sample code retrieves the first match only. I can modify it if I don't find another alternative.

serviceType = "DIA"

tempTxt = obj.GetROProperty("innertext")

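' Locate the keyword (binary, i.e. case-sensitive, comparison)
iStartPoint = InStr(1, tempTxt, serviceType, 0)

If iStartPoint > 0 Then
    ' Skip past the keyword and any spaces that follow it
    tempTxt = LTrim(Mid(tempTxt, iStartPoint + Len(serviceType)))

    ' The value ends at the next space, or at the end of the text
    iEndPoint = InStr(1, tempTxt, " ", 1)
    If iEndPoint = 0 Then iEndPoint = Len(tempTxt) + 1

    MsgBox Left(tempTxt, iEndPoint - 1)
End If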

Here is regex that I am using:

(?<=DIA\s).*?(?=\s)

Here is the demo of what I've tried; it works successfully. I just need to find a VBScript alternative.


Update

Here is the result that I am getting after trying suggested regex:
(The return value looks different because I am using different input text.)


Here is the code that I am using:

Call RegExpMultiSearch(tempTxt, "DIA\s+(\S+)")

Public RegMatchArray

Function RegExpMultiSearch(targetString, ptrn)
    'CREATE THE REGULAR EXPRESSION
    Set regEx = New RegExp
    regEx.Pattern = ptrn
    regEx.IgnoreCase = True    'False
    regEx.Global = True

    'PERFORM THE SEARCH
    Set Matches = regEx.Execute(targetString)

    'REPORTING THE MATCHES COLLECTION
    If Matches.Count = 0 Then
        Actual_Res = "NO occurrence of pattern '" & ptrn & "' found in string '" & targetString & "'"
        Print Actual_Res
    Else
        'ITERATE THROUGH THE MATCHES COLLECTION
        For Each Match in Matches
            'ADD TO ARRAY
            ReDim Preserve arrArray(i)
            arrArray(i) = Match.Value
            i = i + 1
        Next
        Actual_Res = Matches.Count & " occurrence(s) of pattern '" & ptrn & "' found in string '" & targetString & "' successfully"
        Print Actual_Res
        RegMatchArray = arrArray
    End If

    If IsObject(regEx) Then Set regEx = Nothing End If
    If IsObject(Matches) Then Set Matches = Nothing End If
End Function

Final update

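I got the desired result by using the suggested regex. I also had to use Match.SubMatches(0) instead of Match.Value, because Match.Value returns the whole match (including DIA) rather than just the captured group.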
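The SubMatches(0) change can be sketched as follows. This is a minimal version, not the exact code used in the test: the helper name GetCapturedValues and the shortened sample text are assumptions made for illustration. It collects only the Group 1 capture of each match into an array.

```vbscript
' Hypothetical helper (name is an assumption): returns an array holding
' the Group 1 capture of every match, or an empty array if nothing matches.
Function GetCapturedValues(targetString, ptrn)
    Dim regEx, matches, match, results(), i

    Set regEx = New RegExp
    regEx.Pattern = ptrn
    regEx.IgnoreCase = True
    regEx.Global = True

    Set matches = regEx.Execute(targetString)
    If matches.Count = 0 Then
        GetCapturedValues = Array()
        Exit Function
    End If

    ReDim results(matches.Count - 1)
    i = 0
    For Each match In matches
        ' SubMatches(0) is the text captured by the first (...) group
        results(i) = match.SubMatches(0)
        i = i + 1
    Next
    GetCapturedValues = results
End Function

' Usage with a shortened, made-up sample text
values = GetCapturedValues("DIA 8778680044 SVU DIA DS1IT-15600804-123 SVU", "DIA\s+(\S+)")
' values(0) is "8778680044" and values(1) is "DS1IT-15600804-123"
```

Sizing the array once from matches.Count avoids the repeated ReDim Preserve and the separate counter bookkeeping.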


Solution

  • You can rewrite the regex as a pattern with a capturing group that lets you access just the value you need:

    DIA\s+(\S+)
    

    See the regex demo.

    Note that you do not even need the lookahead: .*?(?=\s) matches any 0+ chars other than line break chars, as few as possible, up to the first whitespace, and \S+ already stops at the first whitespace on its own. If you do need to check for the whitespace, just append \s at the end of the pattern.

    Pattern details

    • DIA - a DIA substring (prepend with \b word boundary if you need a whole word match)
    • \s+ - 1 or more whitespaces
    • (\S+) - Group 1: one or more chars other than whitespace chars.

    Here is a VBA test:

    Sub GetValues()
    Dim rExp As Object, allMatches As Object, match As Object
    Dim s As String
    
    s = "DIA 8778680044 SVU-RMW ANNISTON SERF1450 COMMERCE BLVD ANNISTONAL DIA DS1IT-15600804-123 SVU-RMW ANNISTON2130 ROBERTS DR ANNISTONAL"
    
    Set rExp = CreateObject("vbscript.regexp")
    With rExp
        .Global = True
        .MultiLine = False
        .Pattern = "DIA\s+(\S+)"
    End With
    
    Set allMatches = rExp.Execute(s)
    For Each match In allMatches
        Debug.Print match.SubMatches.Item(0)
    Next
    
    End Sub
    

    Output:

    8778680044
    DS1IT-15600804-123
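
The pattern details mention prepending \b when DIA must match as a whole word. A minimal VBScript sketch of the difference follows, assuming a made-up input in which DIA also appears inside the word MEDIA; the helper name CapturedList is hypothetical.

```vbscript
' Hypothetical helper for illustration: returns the Group 1 captures
' of all matches, joined by commas.
Function CapturedList(text, ptrn)
    Dim rx, m, out
    Set rx = New RegExp
    rx.Global = True
    rx.Pattern = ptrn
    out = ""
    For Each m In rx.Execute(text)
        If out <> "" Then out = out & ","
        out = out & m.SubMatches(0)
    Next
    CapturedList = out
End Function

sample = "MEDIA abc DIA xyz-1 end"   ' made-up input

' Without \b, the DIA inside MEDIA also matches
withoutBoundary = CapturedList(sample, "DIA\s+(\S+)")    ' "abc,xyz-1"

' With \b, only the standalone DIA matches
withBoundary = CapturedList(sample, "\bDIA\s+(\S+)")     ' "xyz-1"
```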