Search code examples
algorithmtheorycomplexity-theorynp-completeproof

String to string correction problem np-completeness proof


I have this assignment to prove that this problem:

Finite alphabet £, two strings x,y € £*, and a positive integer K. Is there a way to derive the string y from the string x by a sequence of K or fewer operations of single symbol deletion or adjacent symbol interchange?

is np-complete. I already figured out I have to make transformation from decision version of set covering problem, but I have no clue how to do this. Any help would be appreciated.


Solution

  • It looks like modified Levenshtein distance. Problem can be solved with DP in quadratic time.

    Transformation from minimum set cover (MSC) to this string correction problem is described in:

    Robert A. Wagner
    On the complexity of the Extended String-to-String Correction Problem
    1975, Proceedings of seventh annual ACM symposium on Theory of computing 
    

    In short with MSC problem:

    Given finite sets x_1, ..., x_n, and integer L, does there exists a subset J of {1,...,n} such that |J| <= L, and

    union_{j in J} x_j = union all x_i ?

    Let w = union all x_i, let t = |w| and r = t^2, and choose symbols Q, R, S not in w.

    Take strings:

    A = Q^r R x_1 Q^r S^(r+1) ... Q^r R x_n Q^r S^(r+1)
    B = R Q^r ... R Q^r w S^(r+1) ... S^(r+1)   <- each ... is n times
    and
    k = (l+1)r - 1 + 2t(r+1)(n-1) + n(n-1)(r+1)^2/2 + (r*n + |x_1 ... x_n| - t)*W_d
    [W_d is delete operation weight, can be 1.]
    

    It is shown that string-string correction problems (A,B,k) is satisfiable iff source MSC problem is.

    From strings construction it is clear that proof is not trivial :-) But it isn't too complex to manage.