Search code examples
stringalgorithmtime-complexitynp

Why is the longest common subsequence for multiple strings NP-Hard?


I'm having a tough time understanding why the longest common subsequence problem for multiple strings (k > 2) is NP-Hard. I know that the LCS problem for two strings of length l1, l2 can be solved in O(l1*l2) time. My question is why can't we find the LCS of two strings at a time, for example:

LCS(abcd, ad, abc) = LCS(LCS(abcd, ad), abc) = LCS(ad, abc) = a

This algorithm would take O(k*Max_length^2) for k strings. Why is this wrong?


Solution

  • It is not generally true that LCS(x, y, z) = LCS(x, LCS(y, z)). For example:

    LCS(bb, aaabb, bbaaa) = bb

    LCS(bb, LCS(aaabb, bbaaa)) = LCS(bb, aaa) ≠ bb