Search code examples
crecursiontreeisomorphism

From recursive to iterative function (tree isomorphism)


I want to convert a function that uses recursion to an iterative algorithm.

I have already looked at several explanations on how to switch from one to the other but most of them are limited to easy cases, like factorial, where the recursive function calls itself only once.

In my case I'm working on a function that checks whether two trees are isomorphic:

  • Two trees are isomorphic if one tree can be obtained from other by performing any number of flips, i.e. by swapping a node's left subtree with its right subtree, and doing this for any number of nodes, regardless of the different values of the nodes.

An example of isomorphic trees:

enter image description here

Here is the recursive version, where the number of recursive calls becomes exponential:

bool isIsomorphic(node* n1, node *n2){
 if (n1 == NULL && n2 == NULL)
    return true;
 if (n1 == NULL || n2 == NULL)
    return false;
 return
 (isIsomorphic(n1->left,n2->left) && isIsomorphic(n1->right,n2->right))||
 (isIsomorphic(n1->left,n2->right) && isIsomorphic(n1->right,n2->left));
}

How can I write this as an iterative algorithm?


Solution

  • When looking at the expression that makes recursive calls, we can see there are different states in which the recursive call is launched:

       (isIsomorphic(n1->left,n2->left) && isIsomorphic(n1->right,n2->right))||
    //  ^ initial state                    ^ had success on left/left       ^ success
       (isIsomorphic(n1->left,n2->right) && isIsomorphic(n1->right,n2->left));
    //  ^ failure on one of above           ^ had success on left/right     ^ success
    

    If we use an integer for enumerating states, we'll just need 4 intermediate states (not counting success/failure). We can picture the transitions as follows:

                             state 0
                                |
                           (left, left)
                          /            \
                         |           state 3
                          \             |
                           \      (right, right)
                            \    /              \
                             \  /             success
                              \/
                           state 1
                              |
                         (left, right)                  
                        /             \
                    failure         state 2
                                       |
                                 (right, left)
                                /             \
                            failure         success
    

    The pairs indicate which children are combined for a recursive call. A left side branch indicates the analysed subtrees are not isomorphic, while the right side branch indicates they are.

    In the stack-based solution you could therefore add the state to the stacked item, so that when you backtrack with either success or failure, you can find out what the resulting state should be:

    state push reason for popping next state
    0 left, left isomorphic 3
    0 left, left not isomorphic 1
    1 left, right isomorphic 2
    1 left, right not isomorphic not isomorphic
    2 right, left isomorphic isomorphic
    2 right, left not isomorphic not isomorphic
    3 right, right isomorphic isomorphic
    3 right, right not isomorphic 1

    Here is an implementation of the stack we would need (in C):

    // -------- Stack implemented as a linked list -------
    struct stack_item {
        struct node* n1; 
        struct node* n2;
        int state;
        struct stack_item* next;
    };
    
    struct stack_item* create_item(struct node* n1, struct node* n2, int state) {
        struct stack_item* newitem = malloc(sizeof *newitem);
        newitem->n1 = n1;
        newitem->n2 = n2;
        newitem->state = state;
        newitem->next = NULL;
        return newitem;
    }
    
    void stack_push(struct stack_item** stack, struct stack_item* item) {
        item->next = *stack;
        *stack = item;
    }
    
    struct stack_item* stack_pop(struct stack_item** stack) {
        struct stack_item* popitem = *stack;
        *stack = (*stack)->next;
        return popitem;
    }
    

    And here is the code for the isomophic function itself:

    int isomorphic(struct node* n1, struct node* n2) {
        struct stack_item* stack = NULL;
        struct stack_item* item = create_item(n1, n2, 0);
        
        while (1) {
            if (!item->n1 && !item->n2) {
                // success
                do {
                    free(item);
                    if (!stack) return 1; // overall success
                    item = stack_pop(&stack);
                } while (item->state >= 2);
                item->state ^= 3;
            } else if (!item->n1 || !item->n2) {
                // failure
                do {
                    free(item);
                    if (!stack) return 0; // overall failure
                    item = stack_pop(&stack);
                } while (item->state == 1 || item->state == 2); // failure
                item->state = 1;
            }
            stack_push(&stack, item);
            // The pair to look into depends on the state:
            item = create_item(
                (item->state & 2) ? item->n1->right : item->n1->left, 
                (item->state & 1) ? item->n2->right : item->n2->left, 
                0
            );
        }
    }
    

    Remark

    The original recursive algorithm is not optimal, because if the first recursive call returns true, while the second not, then it is impossible that the alternative two recursive calls (on the second line) would both return true. So actually, the recursive version could better be:

       isIsomorphic(n1->left,n2->left) 
            ? isIsomorphic(n1->right,n2->right)
            : isIsomorphic(n1->left,n2->right) && isIsomorphic(n1->right,n2->left);
    

    This means there are no cases anymore where four recursive calls are needed in the same execution context.

    The iterative function would need to be adapted accordingly, so that failure coming back to a state-3 node leads to failure on that node as well (the loop must continue):

            // failure
            do {
                free(item);
                if (!stack) return 0; // overall failure
                item = stack_pop(&stack);
            } while (item->state >= 1); // failure (also for state 3)