So, VBScript apparently doesn't support Lookbehind at all.
I am looking for an alternative valid Regex that I can use with VBScript.
FYI, I will use this in HP UFT, so I have no choice but to use VBScript. (If there is no simpler way, I might have to look into other options, such as executing Java or other-language code from VBS.)
What I am trying to achieve:
From a given bunch of text, I want to extract a certain alphanumeric string. This string may include -, _, ., /, //, etc.
The only thing I know is that this string will be preceded by a specific word (for example DIA) and that there will be a space after this string.
Here is the VBS code snippet that I can use as an alternative.
This sample code retrieves the first match only. I can modify it if I don't find another alternative.
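serviceType = "DIA"
tempTxt = obj.GetROProperty("innertext")
' Locate the first occurrence of the service type (binary comparison)
iStartPoint = InStr(1, tempTxt, serviceType, 0)
If iStartPoint > 0 Then
    ' Drop everything up to and including the service type, then any leading spaces
    tempTxt = LTrim(Mid(tempTxt, iStartPoint + Len(serviceType)))
    ' The value ends at the next space, or at the end of the text if there is none
    iEndPoint = InStr(1, tempTxt, " ", 1)
    If iEndPoint = 0 Then iEndPoint = Len(tempTxt) + 1
    MsgBox Left(tempTxt, iEndPoint - 1)
End If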
Here is regex that I am using:
(?<=DIA\s).*?(?=\s)
Here is the demo of what I've tried; it works successfully. I just need to find the VBScript alternative.
Update
Here is the result that I am getting after trying the suggested regex:
(The return value looks different because I am using different input text.)
Here is the code that I am using:
Call RegExpMultiSearch(tempTxt, "DIA\s+(\S+)")
Public RegMatchArray
Function RegExpMultiSearch(targetString, ptrn)
    Dim regEx, Matches, Match, arrArray(), i
    'CREATE THE REGULAR EXPRESSION
    Set regEx = New RegExp
    regEx.Pattern = ptrn
    regEx.IgnoreCase = True 'False
    regEx.Global = True
    'PERFORM THE SEARCH
    Set Matches = regEx.Execute(targetString)
    'REPORTING THE MATCHES COLLECTION
    If Matches.Count = 0 Then
        Actual_Res = "NO occurrence of pattern '" & ptrn & "' found in string '" & targetString & "'"
        Print Actual_Res
    Else
        'ITERATE THROUGH THE MATCHES COLLECTION
        i = 0
        For Each Match In Matches
            'ADD TO ARRAY (Match.Value IS THE WHOLE MATCH, INCLUDING THE "DIA" PREFIX)
            ReDim Preserve arrArray(i)
            arrArray(i) = Match.Value
            i = i + 1
        Next
        Actual_Res = Matches.Count & " occurrence(s) of pattern '" & ptrn & "' found in string '" & targetString & "' successfully"
        Print Actual_Res
        RegMatchArray = arrArray
    End If
    'RELEASE THE OBJECTS
    Set Matches = Nothing
    Set regEx = Nothing
End Function
Final update
I got the desired result by using the suggested regex. I also had to use SubMatches(0) instead of Match.Value.
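For anyone landing here later, this is roughly what the change from Match.Value to SubMatches(0) could look like. It is only a minimal sketch: the helper name GetCapturedValues and the shortened sample text are made up for illustration, and WScript.Echo assumes the script is run with cscript (inside UFT, Print or MsgBox would take its place).

```vbscript
' Hypothetical helper: returns an array holding capture group 1 of every match.
Function GetCapturedValues(targetString, ptrn)
    Dim regEx, matches, m, result(), i
    Set regEx = New RegExp
    regEx.Pattern = ptrn
    regEx.IgnoreCase = True
    regEx.Global = True

    Set matches = regEx.Execute(targetString)
    If matches.Count = 0 Then
        GetCapturedValues = Array()   ' empty array when nothing matched
        Exit Function
    End If

    ReDim result(matches.Count - 1)
    i = 0
    For Each m In matches
        ' SubMatches(0) is group 1 only; m.Value would also contain the "DIA " prefix
        result(i) = m.SubMatches(0)
        i = i + 1
    Next
    GetCapturedValues = result
End Function

' Usage with a shortened, made-up sample text
Dim values, v
values = GetCapturedValues("DIA 8778680044 SVU-RMW DIA DS1IT-15600804-123 SVU-RMW", "DIA\s+(\S+)")
For Each v In values
    WScript.Echo v
Next
```

Sizing the array once from matches.Count avoids the repeated ReDim Preserve inside the loop, but that is a matter of taste.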
You may revamp the regex into a pattern with a capturing group that lets you access just the value you need:
DIA\s+(\S+)
See the regex demo.
Note you do not even need the lookahead, since .*?(?=\s) matches any 0+ chars other than line break chars, as few as possible, up to the whitespace. Of course, if you need to check for a whitespace, just append \s at the end of the pattern.
Pattern details
DIA - a DIA substring (prepend with a \b word boundary if you need a whole word match)
\s+ - 1 or more whitespaces
(\S+) - Group 1: one or more chars other than whitespace chars.
Here is a VBA test:
Sub GetValues()
    Dim rExp As Object, allMatches As Object, match As Object
    Dim s As String
    s = "DIA 8778680044 SVU-RMW ANNISTON SERF1450 COMMERCE BLVD ANNISTONAL DIA DS1IT-15600804-123 SVU-RMW ANNISTON2130 ROBERTS DR ANNISTONAL"
    Set rExp = CreateObject("vbscript.regexp")
    With rExp
        .Global = True
        .MultiLine = False
        .Pattern = "DIA\s+(\S+)"
    End With
    Set allMatches = rExp.Execute(s)
    For Each match In allMatches
        ' Print Group 1 only (in plain VBScript, use WScript.Echo instead of Debug.Print)
        Debug.Print match.SubMatches.Item(0)
    Next
End Sub
Output:
8778680044
DS1IT-15600804-123
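To see what the optional \b prefix and the trailing \s mentioned above change, the three variants can be compared side by side. This is a rough VBScript sketch meant to be run with cscript; the sample string is invented for the comparison and is not taken from the question.

```vbscript
Dim s, patterns, p, regEx, m, report

' Made-up input: "MEDIA" contains "DIA", and the last value sits at the very end
s = "MEDIA player DIA DS1IT-15600804-123 SVU DIA 8778680044"

patterns = Array("DIA\s+(\S+)", "\bDIA\s+(\S+)", "\bDIA\s+(\S+)\s")

Set regEx = New RegExp
regEx.Global = True

For Each p In patterns
    regEx.Pattern = p
    report = p & " ->"
    For Each m In regEx.Execute(s)
        ' Collect capture group 1 of every match
        report = report & " [" & m.SubMatches(0) & "]"
    Next
    WScript.Echo report
Next
```

With this input, the plain pattern also picks up "player" because "DIA" is found inside "MEDIA"; the \b variant skips it; and the variant ending in \s drops the final value, since nothing follows it in the string. So the trailing \s is only worth adding when a whitespace after the value is guaranteed.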