Search code examples
stringalgorithmsequences

Interesting strings algorithm


Given two finite sequences of string, A and B, of length n each, for example:

A1: "kk", A2: "ka", A3: "kkk", A4: "a"

B1: "ka", B2: "kakk", B3: "ak", B4: "k"

Give a finite sequences of indexes so that their concentration for A and B gives the same string. Repetitions allowed.

In this example I can't find the solution but for example if the list (1,2,2,4) is a solution then A1 + A2 + A2 + A4 = B1 + B2 + B2 + B4. In this example there are only two characters but it's already very difficult. Actually it's not even trivial to find the shortest solution with one character!

I tried to think of things.. for example the total sum of the length of the strings must be equal and the for the first and last string we need corresponding characters. But nothing else. I suppose for some set of strings it's simply impossible. Anyone can think of a good algorithm?

EDIT: Apparently, this is the Post Correspondence Problem

There is no algorithm that can decide whether a such an instance has a solution or not. If there were, the halting problem could be solved. Dirty trick...


Solution

  • It is not clear what the 'solution' you are looking for is, the longest solution? the shortest? all solutions?
    Since you allow repetition there will an infinite number of solutions for some inputs so I will work on:

    Find all sequences under a fixed length.

    Written as a pseudo code but in a manner very similar to f# sequence expressions

    // assumed true/false functions
    
    let Eq aList bList =  
    // eg Eq "ab"::"c" "a" :: "bc" -> true
    // Eq {} {} is _false_
    
    let EitherStartsWith aList bList =  
    // eg "ab"::"c" "a" :: "b" -> true
    // eg "a" "ab" -> true
    // {} {} is _true_    
    
    let rec FindMatches A B aList bList level
        = seq {
            if level > 0
                if Eq aList bList
                    yield aList
                else if EitherStartsWith aList bList
                    Seq.zip3 A B seq {1..} 
                    |> Seq.iter (func (a,b,i) -> 
                        yield! FindMatches A B aList::(a,i) bList::(b,i) level - 1) }
    
    let solution (A:seq<string>) (B:seq<string>) length =
        FindMatches A B {} {} length
    

    Some trivial constraints to reduce the problem:

    1. The first selection pair must have a common start section.
    2. the final selection pair must have a common end section.

    Based on this we can quickly eliminate many inputs with no solution

    let solution (A:seq<string>) (B:seq<string>) length =
        let starts = {}
        let ends = {}
        Seq.zip3 A B seq {1..} 
        |> Seq.iter(fun (a,b,i) -> 
            if (a.StartsWith(b) or b.StartsWith(a))
                start = starts :: (a,b,i)
            if (a.EndsWith(b) or b.EndsWith(a))
                ends = ends :: (a,b,i))
    
        if List.is_empty starts || List.is_empty ends
            Seq.empty // no solution
        else
            Seq.map (fun (a,b,i) -> 
                FindMatches A B {} :: (a,i) {} :: (b,i) length - 1)
            starts 
            |> Seq.concat