
Python: Length of longest common subsequence of lists


Is there a built-in function in Python that returns the length of the longest common subsequence of two lists?

a=[1,2,6,5,4,8]
b=[2,1,6,5,4,4]

print(a.llcs(b))  # hypothetical method; lists have no llcs()

>>> 3

I tried finding the longest common subsequence and then taking its length, but I think there must be a better solution.


Solution

  • You can easily retool a Longest Common Subsequence (LCS) implementation so that it returns the Length of the Longest Common Subsequence (LLCS) directly:

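    def lcs_length(a, b):
        # table[i][j] is the LCS length of a[:i] and b[:j];
        # row 0 and column 0 stand for the empty prefix.
        table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i, ca in enumerate(a, 1):
            for j, cb in enumerate(b, 1):
                # On a match, extend the LCS of both shorter prefixes;
                # otherwise keep the better result of dropping one element.
                table[i][j] = (
                    table[i - 1][j - 1] + 1 if ca == cb else
                    max(table[i][j - 1], table[i - 1][j]))
        return table[-1][-1]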
    

    Demo:

    >>> a=[1,2,6,5,4,8]
    >>> b=[2,1,6,5,4,4]
    >>> lcs_length(a, b)
    4
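
Each row of the table above depends only on the row before it, so the full table is not strictly needed when only the length is wanted. The following is a minimal sketch under that observation, not part of the original answer; the name `lcs_length_two_rows` is made up for illustration.

```python
def lcs_length_two_rows(a, b):
    """LCS length using only the previous and current table rows.

    Hypothetical helper for illustration; it is meant to return the same
    value as lcs_length while using O(len(b)) memory.
    """
    previous = [0] * (len(b) + 1)  # row for the empty prefix of a
    for ca in a:
        current = [0]  # column 0: empty prefix of b
        for j, cb in enumerate(b, 1):
            if ca == cb:
                # Match: extend the LCS of both shorter prefixes.
                current.append(previous[j - 1] + 1)
            else:
                # No match: best of dropping one element from either list.
                current.append(max(current[j - 1], previous[j]))
        previous = current
    return previous[-1]


a = [1, 2, 6, 5, 4, 8]
b = [2, 1, 6, 5, 4, 4]
print(lcs_length_two_rows(a, b))  # 4
```

The running time is unchanged (every pair of elements is still compared); only the memory footprint shrinks.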
    

    If you want the longest common substring instead (a different but related problem, where the matching elements must be contiguous), use:

    def lcsubstring_length(a, b):
        table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        longest = 0
        for i, ca in enumerate(a, 1):
            for j, cb in enumerate(b, 1):
                if ca == cb:
                    length = table[i][j] = table[i - 1][j - 1] + 1
                    longest = max(longest, length)
        return longest
    

    This is very similar to the lcs_length dynamic programming approach, but here we track the maximum length found so far, because the last cell of the table is no longer guaranteed to hold the maximum.

    For the lists above this returns 3, the result expected in the question:

    >>> lcsubstring_length(a, b)
    3
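
To see why the maximum has to be tracked separately, it can help to build the substring table and compare its last cell with its largest cell. This is only an illustrative sketch; `lcsubstring_table` is a hypothetical helper that does not appear in the answer above.

```python
def lcsubstring_table(a, b):
    """Return the full common-substring table (hypothetical helper).

    table[i][j] is the length of the common run that ends at a[i - 1]
    and b[j - 1]; it stays 0 wherever the two elements differ.
    """
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            if ca == cb:
                table[i][j] = table[i - 1][j - 1] + 1
    return table


a = [1, 2, 6, 5, 4, 8]
b = [2, 1, 6, 5, 4, 4]
table = lcsubstring_table(a, b)
print(table[-1][-1])                   # 0: the last elements, 8 and 4, differ
print(max(max(row) for row in table))  # 3: the run 6, 5, 4
```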
    

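    A sparse-table variant stores only the non-zero cells in a dict instead of allocating every 0 up front (use this if a and b are potentially very large; it saves memory, but every pair of elements is still compared):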

    def lcsubstring_length(a, b):
        table = {}
        longest = 0
        for i, ca in enumerate(a, 1):
            for j, cb in enumerate(b, 1):
                if ca == cb:
                    length = table[i, j] = table.get((i - 1, j - 1), 0) + 1
                    longest = max(longest, length)
        return longest
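
As a rough illustration of how much the sparse table saves, the sketch below returns the number of stored cells alongside the length. The name `lcsubstring_length_sparse` and the extra return value are assumptions made for this example only.

```python
def lcsubstring_length_sparse(a, b):
    """Return (longest common substring length, number of stored cells).

    Hypothetical variant of the sparse version above; the second value is
    returned only to show how few cells the dict actually holds.
    """
    table = {}
    longest = 0
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            if ca == cb:
                # Only matching positions are ever written to the dict.
                length = table[i, j] = table.get((i - 1, j - 1), 0) + 1
                longest = max(longest, length)
    return longest, len(table)


a = [1, 2, 6, 5, 4, 8]
b = [2, 1, 6, 5, 4, 4]
longest, stored = lcsubstring_length_sparse(a, b)
print(longest)                      # 3
print(stored)                       # 6 cells in the dict
print((len(a) + 1) * (len(b) + 1))  # 49 cells in the dense table
```

The saving depends on how often elements of the two lists match; if nearly every pair matches, the dict approaches the size of the dense table.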