Search code examples
arraysexcelvbadebugginglcs

Longest Common Subsequence in VBA Giving #VALUE! Error


I've been putting together a UDF in Excel (365) to calculate the longest common subsequence between two strings (based on this implementation in python https://www.geeksforgeeks.org/printing-longest-common-subsequence/).

When I run the UDF I get a #Value! error on the worksheet. I've done some rudimentary debugging but I'm new to VBA and running into a wall. The message box statements in the code are simply for said crude debugging.

I believe the issue is in my manipulation of the L array. It seems to get to the first case in the first set of for loops then quit as it's evaluating L(i, j,) = 0. Any pointers on where I'm going wrong?

In the worksheet I'm using =ClosestMatch("aabbaaaa", "aaaabbaa") and getting #VALUE! as a result.

This is the VBA code for the UDF I'm attempting:

Function ClosestMatch(ByVal x As String, ByVal y As String, Optional ByVal return_String As Boolean = False) As Variant
    Dim xLen As Integer
    Dim yLen As Integer
    
    xLen = Len(x)
    yLen = Len(y)
    
    MsgBox "x = " & x & " y = " & y
    
    'Create Zeroed Array of xLen+1 x yLen+1 dimensions (intentional extra space).
    ReDim L((xLen + 1), (yLen + 1)) 'indexing starts at 0.
    For i = 0 To (xLen + 1)
        For j = 0 To (yLen + 1)
            L(i, j) = 0
        Next j
    Next i
    
    MsgBox "Created 0'ed array L"
    
    'Build dynamic programming table from the bottom up.
    'Note that L[xLen][yLen] will contain an integer equal to the length
    'of the complete LCS.
    'Note that L[i][j] contains the length of the lcs of x[0..i] and y[0..j]
    For i = 0 To (xLen + 1)
        For j = 0 To (yLen + 1)
            If i = 0 Or j = 0 Then
                L(i, j) = 0
            ElseIf Mid(x, i - 1, 1) = Mid(x, i - 1, 1) Then
                L(i, j) = L(i - 1, j - 1) + 1
            Else
                L(i, j) = WorksheetFunction.Max(L(i - 1, j), L(i, j - 1))
            End If
        Next j
    Next i
    
    'Length of LCS
    Dim LCSlen As Integer
    LCSlen = L(xLen, yLen)
    
    MsgBox "Length of the LCS is " & LCSlen
    
    'Start from the right-most-bottom-most corner and store chars
    'one by on in LCS
    Dim LCS As String
    
    LCS = ""
    i = xLen
    j = yLen
    
    While i > 0 And j > 0
            'If current character in x and y are same, then current char
            'is part of the LCS. The L[xLen][yLen] is the location of the
            'fist charachter we will PUSH onto the front of the LCS string
            If Mid(x, i - 1, 1) = Mid(x, i - 1, 1) Then
                LCS = Mid(x, i - 1, 1) & Right(LCS, Len(LCS))
            
            'If not same, then find the larger of the two lengths in L[][]
            'then go in the direction of the larger value
            ElseIf L(i - 1, j) > L(i, j - 1) Then
                i = i + 1
            Else
                j = j + 1
            End If
    Wend
    
    If return_String Then
        ClosestMatch = LCS
    Else
        ClosestMatch = LCSlen
    End If
    
End Function

Solution

  • After entirely too much timing staring at the watch window...I had a LOT of errors of me copying x where it should've been y, and typing i where it should've been j.

    As I haven't been able to find a VBA example of finding the Longest Common Subsequence, here it is...

    Public Function LCSMatch(ByVal x As Range, ByVal y As Range, Optional ByVal return_String As Boolean = False) As Variant
        Dim xLen As Integer
        Dim yLen As Integer
        
        xLen = Len(x)
        yLen = Len(y)
        
        
        'Create Zeroed Array of xLen+1 x yLen+1 dimensions (intentional extra space).
        ReDim L((xLen), (yLen)) 'indexing starts at 0.
        For i = 0 To (xLen)
            For j = 0 To (yLen)
                L(i, j) = 0
            Next j
        Next i
        
        'Build dynamic programming table from the bottom up...
        'Note that L[xLen][yLen] will contain an integer equal to the length
        'of the complete LCS.
        'Note that L[i][j] contains the length of the lcs of x[0..i] and y[0..j]
        For j = 0 To (yLen)
            For i = 0 To (xLen)
                If i = 0 Or j = 0 Then
                    L(i, j) = 0
                ElseIf Mid$(x, i, 1) = Mid$(y, j, 1) Then
                    L(i, j) = L(i - 1, j - 1) + 1
                Else
                    L(i, j) = WorksheetFunction.Max(L(i - 1, j), L(i, j - 1))
                End If
            Next i
        Next j
        
        'Length of LCS
        Dim LCSlen As Integer
        LCSlen = L(xLen, yLen)
    
        
        'Start from the right-most-bottom-most corner and store chars
        'one by on in LCS
        Dim LCS As String
        
        LCS = ""
        i = xLen
        j = yLen
            
            
            
            While i > 0 And j > 0
                'If current character in x and y are same, then current char
                'is part of the LCS. The L[xLen][yLen] is the location of the
                'fist charachter we will PUSH onto the front of the LCS string
                If Mid$(x, i, 1) = Mid$(y, j, 1) Then
                    LCSPart = Right$(LCS, Len(LCS))
                    LCS = Mid$(x, i, 1) & LCSPart
                    i = i - 1
                    j = j - 1
                    'GoTo Match
                'If not same, then find the larger of the two lengths in L[][]
                'then go in the direction of the larger value
                ElseIf L(i - 1, j) > L(i, j - 1) Then
                    i = i - 1
                Else
                    j = j - 1
                End If
    'Match:
        Wend
    
        MsgBox "Length of the LCS is " & LCSlen
        MsgBox "LCS is " & LCS
    
        If return_String Then
            LCSMatch = LCS
        Else
            LCSMatch = LCSlen
        End If
        
    End Function