java recursion stack-overflow union-find

Why does changing one line in my code result in a stack overflow?


This is a union-find problem: https://leetcode.com/problems/similar-string-groups/

If I change the line parents[find(j)] = i; to parents[find(i)] = j;, the code fails with a stack overflow. Apparently the path becomes too deep for the recursive find() method, but I can't tell what difference this change makes. Can anyone help?

class Solution {
    int[] parents;
    public int numSimilarGroups(String[] A) {
        parents = new int[A.length];
        for(int i = 0;i < parents.length;i++) {
            parents[i] = i;
        }
        for(int i = 0;i < A.length;i++) {
            for(int j = 0;j < i;j++) {
                if(similar(A[i],A[j])) {
                    parents[find(j)] = i;
                }
            }
        }
        int ans = 0;
        for(int i = 0;i < parents.length;i++) {
            if(parents[i] == i)
                ans++;
        }
        return ans;
    }

    private int find(int curr) {
        int p = parents[curr];
        if(p != curr) {
            int pp = find(p);
            parents[curr] = pp;
        }
        return parents[curr];
    }

    private boolean similar(String a, String b) {
        int diff = 0;
        int i = 0;
        boolean consecutive = false;
        while(diff <= 2 && i < a.length()) {
            if(a.charAt(i) != b.charAt(i))
                diff++;
            if(i > 0 && a.charAt(i) == a.charAt(i-1))
                consecutive = true;
            i++;
        }
        return diff == 2 || diff == 0 && consecutive;
    }
}

Solution

  • Using parents[find(i)] = j attaches the root of i's tree directly to j, and j is not necessarily a root. Because j is always less than i, a parent value can now be smaller than its index, and j may already sit below that root in the same tree. This can produce two elements that point at each other. For example:

    Given A.length == 5, your starting array would look like:

    parents[0] = 0; parents[1] = 1; parents[2] = 2; parents[3] = 3; parents[4] = 4;

    The (i, j) pairs used below are ones for which similar returns true. Starting with i = 2, j = 1, this would make the calls:

    find(2);    //Array doesn't change in recursive function
    
    //Resulting array after applying j to parents[2]:
    //          parents[0] = 0; parents[1] = 1; parents[2] = 1; parents[3] = 3; parents[4] = 4;
    

    Next, i = 3, j = 1:

    find(3);    //Array doesn't change in recursive function
    
    //Resulting array after applying j to parents[3]:
    //          parents[0] = 0; parents[1] = 1; parents[2] = 1; parents[3] = 1; parents[4] = 4;
    

    Then i = 3, j = 2:

    find(3); find(1);    //Array doesn't change in recursive function
    
    //Resulting array after applying j to parents[1]:
    //          parents[0] = 0; parents[1] = 2; parents[2] = 1; parents[3] = 1; parents[4] = 4;
    

    You can see now that the infinite loop is set up (parents[1] = 2; parents[2] = 1). Any call to find that reaches 1 or 2 will bounce between these two values until the stack overflows. Two more steps are needed to trigger it. i = 4, j = 1:

    find(4);    //Array doesn't change in recursive function
    
    //Resulting array after applying j to parents[4]:
    //          parents[0] = 0; parents[1] = 2; parents[2] = 1; parents[3] = 1; parents[4] = 1;
    

    Finally, i = 4, j = 2:

    find(4); find(1); find(2); find(1); find(2); find(1); find(2); ...
    

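    Using parents[find(j)] = i means that an assigned value can never be lower than the index it is stored at: i only increases, whereas j repeats for every iteration of i and can only be a value from 0 to i - 1, so find(j) is at most i. Every parent pointer therefore leads to an equal or larger index, and no cycle can form.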
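
    A common way to make the union safe in either direction is to link root to root instead of root to an arbitrary element. The following is a minimal sketch of that idea, not the original poster's code: the class name UnionFindSketch and the methods union and countGroups are made up for illustration, and find is written iteratively here (an assumption of this sketch) so that it does not depend on recursion depth.

```java
class UnionFindSketch {
    private final int[] parents;

    UnionFindSketch(int n) {
        parents = new int[n];
        for (int i = 0; i < n; i++) {
            parents[i] = i; // every element starts as its own root
        }
    }

    // Iterative find with path compression (hypothetical replacement
    // for the recursive find in the question).
    int find(int x) {
        int root = x;
        while (parents[root] != root) {
            root = parents[root];
        }
        // Second pass: point every node on the path directly at the root.
        while (parents[x] != root) {
            int next = parents[x];
            parents[x] = root;
            x = next;
        }
        return root;
    }

    // Link root to root. A root is only ever attached to another root,
    // so the parent pointers cannot form a cycle in either argument order.
    void union(int a, int b) {
        int rootA = find(a);
        int rootB = find(b);
        if (rootA != rootB) {
            parents[rootA] = rootB;
        }
    }

    // Counts elements that are their own parent, as in the question.
    int countGroups() {
        int groups = 0;
        for (int i = 0; i < parents.length; i++) {
            if (parents[i] == i) {
                groups++;
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        UnionFindSketch uf = new UnionFindSketch(5);
        // Same (i, j) pairs, in the same order, as the walkthrough above.
        int[][] pairs = {{2, 1}, {3, 1}, {3, 2}, {4, 1}, {4, 2}};
        for (int[] p : pairs) {
            uf.union(p[0], p[1]);
        }
        System.out.println(uf.countGroups()); // prints 2: {0} and {1, 2, 3, 4}
    }
}
```

    With this version, the step i = 3, j = 2 finds that both elements already share root 1 and changes nothing, so the 1 <-> 2 cycle from the walkthrough never appears.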