I need to tokenize a mathematical expression using VBA. I have a working solution but am looking for a more efficient way of doing it (possibly RegExp).
My current solution:
Function TokeniseTheString(ByVal str As String) As String()
    ' ByVal so that the caller's string is not modified by the replacements below
    Dim Operators() As String
    ' Array of operators
    Operators = Split("+,-,/,*,^,<=,>=,<,>,=", ",")

    ' Surround every "(", ")" and "," with the separator character Chr(1)
    str = Replace(str, "(", Chr(1) & "(" & Chr(1))
    str = Replace(str, ")", Chr(1) & ")" & Chr(1))
    str = Replace(str, ",", Chr(1) & "," & Chr(1))

    Dim i As Long
    ' Surround every operator with the separator character
    For i = LBound(Operators) To UBound(Operators)
        str = Replace(str, Operators(i), Chr(1) & Operators(i) & Chr(1))
    Next i

    ' "<=" and ">=" have now been split apart by two separator characters;
    ' join the two halves back into a single token
    For i = LBound(Operators) To UBound(Operators)
        If Len(Operators(i)) = 2 Then
            str = Replace(str, Left(Operators(i), 1) & Chr(1) & Chr(1) & Right(Operators(i), 1), Operators(i))
        End If
    Next i

    ' Adjacent "(", ")", "," or operators leave runs of separators; collapse each run to one
    Do While InStr(str, Chr(1) & Chr(1)) > 0
        str = Replace(str, Chr(1) & Chr(1), Chr(1))
    Loop

    ' Remove a separator at the start or end of the string so Split returns no empty tokens
    If Left(str, 1) = Chr(1) Then str = Mid(str, 2)
    If Right(str, 1) = Chr(1) Then str = Left(str, Len(str) - 1)
    TokeniseTheString = Split(str, Chr(1))
End Function
Testing it with the string IF(TestValue>=0,TestValue,-TestValue) gives me the desired result.
Sub test()
    Dim TokenArray() As String
    TokenArray = TokeniseTheString("IF(TestValue>=0,TestValue,-TestValue)")
End Sub
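To see what the test actually produces, the tokens can be printed to the Immediate window. This is only a small sketch built on the TokeniseTheString function above; the Sub name PrintTokens and the " | " separator are my own choices for illustration.

```vba
Sub PrintTokens()
    ' Hypothetical helper: prints the token list so it can be checked by eye
    Dim TokenArray() As String
    TokenArray = TokeniseTheString("IF(TestValue>=0,TestValue,-TestValue)")

    ' Number of tokens returned by the tokenizer
    Debug.Print UBound(TokenArray) - LBound(TokenArray) + 1
    ' prints 11

    ' Join the tokens with a visible separator
    Debug.Print Join(TokenArray, " | ")
    ' prints IF | ( | TestValue | >= | 0 | , | TestValue | , | - | TestValue | )
End Sub
```

Note that the unary minus in -TestValue comes out as its own token, exactly like a binary minus; telling the two apart is left to whatever consumes the tokens.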
I have never used regular expressions before and tried to implement this with them in VBA. The problem I am having is that the RegExp object in VBA does not support positive lookbehind.
I would appreciate any solution that is more efficient than mine above.
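For what it is worth, a common way around the missing lookbehind is to match the preceding text as well and read only a capturing group. The sketch below is an illustration of that idea, not a tokenizer: the Sub name and the pattern (an identifier that directly follows an opening parenthesis) are assumptions chosen for the example, and late binding via CreateObject is used so no library reference is needed.

```vba
Sub LookbehindWorkaroundSketch()
    ' A lookbehind version would be (?<=\()[A-Za-z_]\w*, which VBScript RegExp rejects.
    ' Instead, match the "(" too and capture only the part that is wanted.
    Dim re As Object
    Dim m As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "\(([A-Za-z_]\w*)"

    For Each m In re.Execute("IF(TestValue>=0,TestValue,-TestValue)")
        ' SubMatches(0) is the first capturing group, without the leading "("
        Debug.Print m.SubMatches(0)
        ' prints TestValue
    Next m
End Sub
```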
As suggested by @Florent B, the following function gives the same results using RegExp:
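Function TokenRegex(ByVal str As String) As String()
    ' Early binding: requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim objRegEx As New RegExp
    Dim strPattern As String

    ' A token is a quoted string, a run of non-operator characters, "<=", ">=",
    ' or any other single non-space character; whitespace after a token is discarded
    strPattern = "(""(?:""""|[^""])*""|[^\s()+\-\/*^<>=,]+|<=|>=|\S)\s*"
    With objRegEx
        .Global = True
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = strPattern
    End With

    ' Put the separator ChrW(-1) after every token, drop the trailing one, then split
    str = objRegEx.Replace(str, "$1" & ChrW(-1))
    If Right(str, 1) = ChrW(-1) Then str = Left(str, Len(str) - 1)
    TokenRegex = Split(str, ChrW(-1))
End Function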
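A possible variation, sketched here under the assumption that the same alternation is kept, is to read the tokens straight from the match collection returned by Execute instead of inserting a separator and calling Split. The function name TokenMatches and the variable names are hypothetical; late binding is used so the sketch runs without adding a library reference.

```vba
Function TokenMatches(ByVal expr As String) As String()
    ' Hypothetical alternative: one regex match per token, no separator character
    Dim re As Object
    Dim matches As Object
    Dim tokens() As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' Same alternation as above, without the outer group and the trailing \s*
    re.Pattern = """(?:""""|[^""])*""|[^\s()+\-\/*^<>=,]+|<=|>=|\S"

    Set matches = re.Execute(expr)
    If matches.Count = 0 Then
        ' Nothing to tokenize: return an empty array
        TokenMatches = Split(vbNullString)
        Exit Function
    End If

    ReDim tokens(0 To matches.Count - 1)
    For i = 0 To matches.Count - 1
        tokens(i) = matches.Item(i).Value
    Next i
    TokenMatches = tokens
End Function
```

Because whitespace is never part of a match, spaces between tokens are skipped without any extra cleanup, and the input can safely contain any character since none is reserved as a separator.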