Search code examples
javaalgorithmoptimizationdynamic-programminglcs

How can I optimize the algorithm to find the longest common subsequence (LCS) between two strings?


An algorithm used to efficiently find the LCS between two strings:

  1. Define two strings, string1 and string2, with lengths m and n, respectively.
  2. Create a 2D array, dp, with dimensions (m+1) x (n+1). Initialize all the elements of dp to 0.
  3. Iterate over the characters of string1 from left to right (i) and string2 from top to bottom (j).
  4. If string1[i] is equal to string2[j], increment dp[i+1][j+1] by 1. This means the current characters match, and the length of the LCS ending at this point is one more than the LCS ending at the previous characters.
  5. If string1[i] is not equal to string2[j], take the maximum value between dp[i][j+1] and dp[i+1][j] and store it in dp[i+1][j+1]. This means the current characters do not match, so the length of the LCS ending at this point is the maximum of the LCS ending at the previous characters in either string1 or string2.
  6. After iterating over all characters, the value of dp[m][n] will be the length of the LCS between string1 and string2.
  7. To retrieve the actual LCS, start from dp[m][n] and backtrack through the dp array. If string1[i-1] is equal to string2[j-1], add the character to the LCS and move diagonally to dp[i-1][j-1]. Otherwise, move to the left if dp[i][j-1] is greater than dp[i-1][j], or move upward if dp[i-1][j] is greater.
  8. Repeat step 7 until reaching the top-left corner of the dp array.

By following this algorithm, the longest common subsequence between two strings can be found in programming.

An example of Java code:

public static String repeatCharacter(char ch, int length) {
    StringBuilder sb = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
        sb.append(ch);
    }
    return sb.toString();
}

Scanner myObj = new Scanner(System.in);
char ch = 'X';

System.out.println("Enter size of first string");
int m = myObj.next();
String string1 = repeatCharacter(ch, m);
System.out.println("Enter size of first string");
int m = myObj.next()
String string2 = repeatCharacter(ch, n);
String lcs = "";

System.out.println("Enter first string");
string1 = myObj.nextLine(); 
System.out.println("Enter second string");
string1 = myObj.nextLine(); 
int dp[m+1][n+1];
int i, j;

for (i = 0; i < m + 1; i++) {
     for (j = 0; j < n + 1; j++) {
          dp[i][j] = 0;
     }
}

for (i = 0; i < m; i++) {
     for (j = 0; j < n; j++) {
          if (string1[i] = string2[j])
              dp[i+1][j+1]++;
          else
              dp[i+1][j+1] = Math.max(dp[i][j+1], dp[i+1][j]);
     }
}

i = m, j = n;
while (i > -1 && j > -1) {
    if (string1[i-1] = string2[j-1]) {
        lcs = lcs + string1[i-1];
        i--;
        j--;
    }
    else if (dp[i][j-1] > dp[i-1][j])
        i--;
    else
        j--;
}

However, using nested for loops can take O(m n) in both time and space.

How can I optimize the time complexity and the space complexity of this algorithm?


Solution

  • In this case the linear search approach is more suitable and efficient :

    public class LCS {
        public static void main(String[] args) {
            Scanner scanner = new Scanner(System.in);
    
            System.out.println("Enter first string:");
            String string1 = scanner.nextLine();
    
            System.out.println("Enter second string:");
            String string2 = scanner.nextLine();
    
            int[][] dp = calculateLCS(string1, string2);
    
            String lcs = findLCS(dp, string1, string2);
    
            System.out.println("Length of Longest Common Subsequence: " + dp[string1.length()][string2.length()]);
            System.out.println("Longest Common Subsequence: " + lcs);
        }
    
        private static int[][] calculateLCS(String string1, String string2) {
            int m = string1.length();
            int n = string2.length();
    
            int[][] dp = new int[m + 1][n + 1];
    
            IntStream.range(0, m + 1).forEach(i -> dp[i][0] = 0);
            IntStream.range(0, n + 1).forEach(j -> dp[0][j] = 0);
    
            IntStream.range(0, m).forEach(i ->
                    IntStream.range(0, n).forEach(j -> {
                        if (string1.charAt(i) == string2.charAt(j)) {
                            dp[i + 1][j + 1] = dp[i][j] + 1;
                        } else {
                            dp[i + 1][j + 1] = Math.max(dp[i][j + 1], dp[i + 1][j]);
                        }
                    })
            );
    
            return dp;
        }
    
        private static String findLCS(int[][] dp, String string1, String string2) {
            StringBuilder lcs = new StringBuilder();
            int i = string1.length(), j = string2.length();
            while (i > 0 && j > 0) {
                if (string1.charAt(i - 1) == string2.charAt(j - 1)) {
                    lcs.insert(0, string1.charAt(i - 1));
                    i--;
                    j--;
                } else if (dp[i - 1][j] > dp[i][j - 1]) {
                    i--;
                } else {
                    j--;
                }
            }
            return lcs.toString();
        }
    }