Efficiently constructing palindromes using string rewriting system

I am trying to solve the following challenge:

Given a string rewriting system with the following rules:

c -> a c b

c -> b a c a

c -> b c b a b

Given a starting string c, find an efficient approach to constructing a palindrome.

Construction example with rule sequence (1, 1, 2):

c -(rule 1)-> acb -(rule 1)-> aacbb -(rule 2)-> aabacabb

My attempts:

Generating permutations: I considered generating all permutations of a sequence of rewriting steps (n steps) but found it computationally expensive.
Balancing approach: Rule 2 unbalances the string leftward, while rule 3 unbalances it rightward. Rule 1 is neutral. This led me to explore sequences where rule 2 is applied twice as often as rule 3 to maintain balance.
Last rule constraint: Applying rule 1 last cannot create a palindrome, as it would introduce a non-palindromic trigram (acb) in the middle.

While balancing and last rule constraints reduce the search space, brute-forcing remains computationally intensive. Are there more structural approaches or tighter constraints to solve this problem efficiently?

I know that a solution to this within a sequence of 150 steps exists.

Solution

This comes down to searching a path in a graph, where the vertices are the current string of "a", "b" and "c", and the edges are the applications of one of the three transition rules.

Some preliminary observations

Not all rules lead to a state that can still become a palindrome, as also is apparent from the example that is given in the question. So those edges can be discarded. For example, from the very first state there is only one transition that still allows a palindrome to be formed, which is the rule c→bcbab. The other two rules would put different characters from the set ("a", "b") at the outer ends of the resulting string, which can never change again, and thus can never fit a palindrome.

Once a transition is performed, the characters at the outside of the string that are equal can be discarded, as they will be part of the final palindrome, and can never change anyway. By this elimination we can unify vertices in the search tree, as only the non-eliminated characters are relevant for the rest of the actions.

We will end up with a string that either starts with "c" or ends with "c", but it turns out that with these rules the "c" is always at the left. After the very first transition (where we have no choice), and the removal of the equal "b", the "c" is at the left of the resulting state: "cba". All transitions which are now possible, will always result in "c" at the left. Here are the patterns that we could get, and which rules can be applied. The asterisk represents any sequence of a and b (including no sequence at all):

pattern (current state)	rule	result	remainder (next state)
`c*a`	`c→acb`	`acb*a`	`cb*`
`cb`	`c→baca`	`bacab`	`c` (solved)
`c*ab`	`c→baca`	`baca*ab`	`ca*`
`c*b`	`c→bcbab`	`bcbab*b`	`cbab*`

So we actually only need to look at the remaining characters at the right of this "c" as there will never be anything that remains (to be resolved) at the left of it.

Solving the path-finding problem

You could apply a breath-first search, where you push an extension (of a state to the next state) to a queue. But some transitions are more interesting than others. In case the state ends in a there is no choice, but when it is b there are potentially two rules to choose from (see above table). When the current string ends with "ab" we could apply rule c→baca (transition 3 in the above table) or rule c→bcbab (transition 4 in the above table). The first one is more interesting because it eliminates two characters from both sides, and only adds one to the remainder: this is the only transition that reduces the length of the remaining characters. The other one is less interesting, as it makes the remaining string longer.

So we could improve a plain breadth-first search by giving some priority to states in the queue. A very basic improvement would be to use a deque, and put the state at the front when it is shorter than the state we came from (in terms of the "remaining" string), and at the back of the deque otherwise. This turns out to be a critical improvement.

Implementation

Here is an implementation in JavaScript:

function getSequence() {
    // We use a "primitive" priority queue: a deque
    // A deque's entry has two parts:
    //   - an array with applied actions (i.e. rules as strings)
    //   - the string at the right side of "c", but in reverse (for easier manipulation)
    // At the start only rule "c→bcbab" works out, and the outer "b" can be removed.
    // So the deque will start with this "cba" as entry. We don't store the "c", as it is always at the left.
    //  and the remainder of the string is stored in reverse, so "ab":
    const deque = [ [["bcbab"], "ab"] ]; // After action [c→bcbab], we have a left-over of "cba" (reversed)
    const visited = new Set();
    for (let i = 1; i < 1e6; i++) { // Put a hard limit on the process
        const [seq, right] = deque.shift(); // Get a next state
        if (visited.has(right)) {
            continue; // Avoid processing the same state twice
        }
        visited.add(right);
        if (right == "b") {
            seq.push("baca"); // This rule solves it
            console.log(`Solved after ${i} iterations`);
            return seq;
        } else if (right[0] == "a") { 
            seq.push("acb");
            // With this rule we can cancel out "a", but add a "b"
            deque.push([seq, right.slice(1) + "b"]);
        } else { // right[0] == "b"
            if (right[1] == "a") {
                // As we have two applicable rules, copy the current sequence (concat).
                // With this rule we can cancel out "ba", but add "a" (the only case where we GAIN)
                // Give priority by enqueuing at the left side of the deque
                deque.unshift([seq.concat("baca"), right.slice(2) + "a"]);
            }
            seq.push("bcbab");
            // With this rule we can cancel out "b", but add "bab", making it longer :-(
            deque.push([seq, right.slice(1) + "bab"]);
        }
    }
    throw "Could not find a solution in a reasonable number of iterations";
}

function applySequence(seq) {
    console.log(`The solution consists of ${seq.length} actions:`);
    let s = "c";
    let palindrome = "c";
    for (let i = 0; i < seq.length; i++) {
        const action = seq[i];
        // Apply action 
        s = s.replace("c", action);
        palindrome = palindrome.replace("c", action);
        // Keep removing outer-side characters when they are equal 
        //   so to focus the output on what remains to be done.
        while (s.length > 0 && s[0] === s.at(-1)) {
            s = s.slice(1, -1);
        }
        console.log(`${i+1}. c → ${action}: ${s}`);
    }
    console.log("palindrome:", palindrome);
}

const seq = getSequence();
applySequence(seq);

Produced solution:

Runnning the snippet you'll see it almost immediately prints out a solution that consists of 44 actions:

1. c → bcbab: cba
2. c → acb: cbb
3. c → bcbab: cbabb
4. c → bcbab: cbabbab
5. c → baca: cababb
6. c → bcbab: cbababab
7. c → baca: cababab
8. c → bcbab: cbabababa
9. c → acb: cbbababab
10. c → baca: cabbabab
11. c → baca: caabbab
12. c → baca: caaabb
13. c → bcbab: cbabaaab
14. c → baca: cababaa
15. c → acb: cbababa
16. c → acb: cbbabab
17. c → baca: cabbab
18. c → baca: caabb
19. c → bcbab: cbabaab
20. c → baca: cababa
21. c → acb: cbabab
22. c → baca: cabab
23. c → baca: caab
24. c → bcbab: cbabaa
25. c → acb: cbbaba
26. c → acb: cbbbab
27. c → baca: cabbb
28. c → bcbab: cbababb
29. c → bcbab: cbabbabab
30. c → baca: cababbab
31. c → baca: caababb
32. c → bcbab: cbabaabab
33. c → baca: cababaab
34. c → baca: caababa
35. c → acb: cbaabab
36. c → baca: cabaab
37. c → baca: caaba
38. c → acb: cbaab
39. c → baca: caba
40. c → acb: cbab
41. c → baca: cab
42. c → baca: ca
43. c → acb: cb
44. c → baca:

The so constructed palindrome is:

babbbabbabababababbaaabababbaabababaababbbababbabaababaabaababaabacabaababaabaababaababbababbbab

This is also the shortest solution in terms of number of applied rules:

This is an A* algorithm

We can prove that this solution is the shortest by showing that the proposed algorithm actually performs an A* search algorithm where the heuristic admissible cost is defined by the number of "b" characters in the remainder string. Add to that the already incurred cost, which is the path length of the already performed actions, and we have the lower-bound for the total cost. It turns out that the "interesting" transition (c→baca) is the only one that keeps that cost equal (the increase of the path length is countered by reduction of the "b" count). The other (non-finishing) two actions both increase that cost with 2. This means that at any time we only have two categories of states in terms of cost: those with the current cost, and those with a cost that is two units greater. And this is exactly what we achieve with the deque where we append the states that increase the cost, and prepend those that don't: the states are visited in order of A*-cost.

The above algorithm could also be performed by a pure breadth-first search. For this you'd replace the unshift call with a push call putting the "interesting" state also at the end of the queue like all others.

In that case the states will be visited in order of path-length, and we will also be sure to find the shortest path to the solution. With that change, the code will run for a few minutes (135 seconds on my laptop) and eventually output the same solution, as expected.

This is not the only solution, as obviously you could continue from the solved state and repeat the same sequence again (applying 88 actions). But discarding that, there are also some other solutions you could find, like this one, which has 56 actions, taking a different branch at step 11:

1. c → bcbab: cba
2. c → acb: cbb
3. c → bcbab: cbabb
4. c → bcbab: cbabbab
5. c → baca: cababb
6. c → bcbab: cbababab
7. c → baca: cababab
8. c → bcbab: cbabababa
9. c → acb: cbbababab
10. c → baca: cabbabab
11. c → bcbab: cbababbaba
12. c → acb: cbbababbab
13. c → baca: cabbababb
14. c → bcbab: cbababbabab
15. c → baca: cabababbab
16. c → baca: caabababb
17. c → bcbab: cbabaababab
18. c → baca: cababaabab
19. c → bcbab: cbabababaaba
20. c → acb: cbbabababaab
21. c → baca: cabbabababa
22. c → acb: cbabbababab
23. c → baca: cababbabab
24. c → baca: caababbab
25. c → bcbab: cbabaababba
26. c → acb: cbbabaababb
27. c → bcbab: cbabbbabaabab
28. c → baca: cababbbabaab
29. c → baca: caababbbaba
30. c → acb: cbaababbbab
31. c → baca: cabaababbb
32. c → bcbab: cbababaababb
33. c → bcbab: cbabbababaabab
34. c → baca: cababbababaab
35. c → baca: caababbababa
36. c → acb: cbaababbabab
37. c → baca: cabaababbab
38. c → baca: caabaababb
39. c → bcbab: cbabaabaabab
40. c → baca: cababaabaab
41. c → baca: caababaaba
42. c → acb: cbaababaab
43. c → baca: cabaababa
44. c → acb: cbabaabab
45. c → baca: cababaab
46. c → baca: caababa
47. c → acb: cbaabab
48. c → baca: cabaab
49. c → baca: caaba
50. c → acb: cbaab
51. c → baca: caba
52. c → acb: cbab
53. c → baca: cab
54. c → baca: ca
55. c → acb: cb
56. c → baca: