Longest Common Subsequence
Definition
Given two strings, find the length of the longest subsequence present in both of them. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguously.
- Example: “ACE” is a subsequence of “ABCDE”.
How it Works
We compare the two strings character by character.
- Match: If characters match (e.g., ‘A’ == ‘A’), we add 1 to the result and move both pointers forward.
- No Match: If they don’t match, we branch into two possibilities:
- Skip a character from string A.
- Skip a character from string B.
- Take the maximum of these two options.
Code Example
PythonJavaCC++
# Longest Common Subsequence (Dynamic Programming)
def lcs(X, Y):
m = len(X)
n = len(Y)
# Create DP table
dp = [[0]*(n+1) for _ in range(m+1)]
for i in range(1, m+1):
for j in range(1, n+1):
if X[i-1] == Y[j-1]:
dp[i][j] = 1 + dp[i-1][j-1]
else:
dp[i][j] = max(dp[i-1][j], dp[i][j-1])
return dp[m][n]
s1 = "AGGTAB"
s2 = "GXTXAYB"
print("Length of LCS is", lcs(s1, s2)) # 4
public class LCS_DP {
public static int lcs(String X, String Y) {
int m = X.length();
int n = Y.length();
int[][] dp = new int[m + 1][n + 1];
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (X.charAt(i - 1) == Y.charAt(j - 1))
dp[i][j] = 1 + dp[i - 1][j - 1];
else
dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
}
}
return dp[m][n];
}
public static void main(String[] args) {
String s1 = "AGGTAB";
String s2 = "GXTXAYB";
System.out.println("Length of LCS is " + lcs(s1, s2));
}
}
#include <stdio.h>
#include <string.h>
int max(int a, int b) {
return (a > b) ? a : b;
}
int lcs(char X[], char Y[]) {
int m = strlen(X);
int n = strlen(Y);
int dp[m+1][n+1];
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
if (i == 0 || j == 0)
dp[i][j] = 0;
else if (X[i-1] == Y[j-1])
dp[i][j] = 1 + dp[i-1][j-1];
else
dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
}
}
return dp[m][n];
}
int main() {
char s1[] = "AGGTAB";
char s2[] = "GXTXAYB";
printf("Length of LCS is %d\n", lcs(s1, s2));
return 0;
}
#include <iostream>
#include <vector>
using namespace std;
int lcs(string X, string Y) {
int m = X.length();
int n = Y.length();
vector<vector<int>> dp(m+1, vector<int>(n+1,0));
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (X[i-1] == Y[j-1])
dp[i][j] = 1 + dp[i-1][j-1];
else
dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
}
}
return dp[m][n];
}
int main() {
string s1 = "AGGTAB";
string s2 = "GXTXAYB";
cout << "Length of LCS is "
<< lcs(s1, s2);
return 0;
}
Complexity Analysis
- Time Complexity: O(M times N) where M and N are string lengths.
- Space Complexity: O(M times N) for the table.
Use Cases
- Diff Tools: Comparing two versions of a file (like Git).
- DNA Sequencing: Finding similarities between gene strands.