Search code examples
regexexcelvbatokenizemathematical-expressions

Tokenise mathematical (infix) expression in VBA


I need to tokenize a mathematical expression using VBA. I have a working solution but am looking for a more efficient way of doing it (possibly RegExp).

My current solution:

Function TokeniseTheString(str As String) As String()

Dim Operators() As String
' Array of Operators:
Operators = Split("+,-,/,*,^,<=,>=,<,>,=", ",")

' add special characters around all "(", ")" and ","
str = Replace(str, "(", Chr(1) & "(" & Chr(1))
str = Replace(str, ")", Chr(1) & ")" & Chr(1))
str = Replace(str, ",", Chr(1) & "," & Chr(1))

Dim i As Long
' add special characters around all operators
For i = LBound(Operators) To UBound(Operators)
    str = Replace(str, Operators(i), Chr(1) & Operators(i) & Chr(1))
Next i

' for <= and >=, there will now be two special characters between them instead of being one token
' to change <  = back to <=, for example
For i = LBound(Operators) To UBound(Operators)
    If Len(Operators(i)) = 2 Then
        str = Replace(str, Left(Operators(i), 1) & Chr(1) & Chr(1) & Right(Operators(i), 1), Operators(i))
    End If
Next i

' if there was a "(", ")", "," or operator next to each other, there will be two special characters next to each other
Do While InStr(str, Chr(1) & Chr(1)) > 0
    str = Replace(str, Chr(1) & Chr(1), Chr(1))
Loop
' Remove special character at the end of the string:
If Right(str, 1) = Chr(1) Then str = Left(str, Len(str) - 1)

TokeniseTheString = Split(str, Chr(1))

End Function

Test using this string IF(TestValue>=0,TestValue,-TestValue) gives me the desired solution.

Sub test()
Dim TokenArray() As String
TokenArray = TokeniseTheString("IF(TestValue>=0,TestValue,-TestValue)")
End Sub

I have never seen regular expressions before and tried to implement this into VBA. The problem I am having is that the RegExp object in VBA doesn't allow positive lookbehind.

I will appreciate any more efficient solution than mine above.


Solution

  • As suggested by @Florent B, the following function gives the same results using RegExp:

    Function TokenRegex(str As String) As String()
    Dim objRegEx As New RegExp
    Dim strPattern As String
    
    strPattern = "(""(?:""""|[^""])*""|[^\s()+\-\/*^<>=,]+|<=|>=|\S)\s*"
    With objRegEx
        .Global = True
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = strPattern
    End With
    
    str = objRegEx.Replace(str, "$1" & ChrW(-1))
    If Right(str, 1) = ChrW(-1) Then str = Left(str, Len(str) - 1)
    TokenRegex = Split(str, ChrW(-1))
    
    End Function