I'm studying the disjoint-set data structure, and I'm on the chapter about the fast FIND implementation (Quick Find).
The data structure is shown below:
For example:

int array[5];
array[0] = 3;
array[1] = 5;
array[2] = 7;
array[3] = 1;
array[4] = 2;

Here array[element] = set name, i.e. the index is the element and the value stored there is the name of the set containing it.
So element 0 is in set 3, element 1 is in set 5, ..., etc.
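If I understand it right, FIND is then just a single array lookup. A rough sketch, using the array from the example above:

int Find(int element) {
    return array[element];    /* the value stored at the element's index is its set name: O(1) */
}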
Performing Union(a, b) (assuming a is in set i and b is in set j) takes O(n) operations; I understand that part. The reason is shown in the pseudocode below:
void Union(int a, int b) {
    int set1 = Find(a);                 /* set1 will be 'i' */
    int set2 = Find(b);                 /* set2 will be 'j' */
    if (set1 != set2) {
        for (int k = 0; k < n; k++) {   /* n = number of elements, so O(n) */
            if (array[k] == set1)
                array[k] = set2;        /* relabel every element of set i as set j */
        }
    }
}
But I can't understand this passage in the book:
A sequence of n - 1 unions take O(n^2) time in the worst case. If there are O(n^2) FIND operations, this performance is fine, as the average time complexity is O(1) for each UNION or FIND operation. If there are fewer FINDs, this complexity is not acceptable.
I can't understand what these sentences mean, or why the complexity is O(n^2). I can't see how that complexity follows from the code I wrote above.
We want the total complexity over all operations to be as small as possible:
Total complexity = complexity(all FINDs) + complexity(all UNIONs)
As you quoted, a sequence of n - 1 UNIONs takes O(n^2) time in the worst case. The reason: each UNION scans the whole array, so one UNION costs O(n), and (n - 1) * O(n) = O(n^2). A FIND, on the other hand, is a single array lookup, so it costs O(1).
So, for those n - 1 UNION operations, how many FINDs can we afford without pushing the total complexity above O(n^2)?
The answer is O(n^2) FINDs, because O(n^2) lookups at O(1) each also cost O(n^2) in total.
So as long as the number of FINDs is on the order of n^2 and the number of UNIONs is on the order of n, the total stays O(n^2), which works out to O(1) on average per operation. If there are far fewer FINDs, the O(n^2) spent on UNIONs dominates the total, the average cost per operation is no longer O(1), and that is why the book says the complexity is not acceptable in that case.
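To make the O(n^2) concrete (since you said you couldn't see it from your code), here is a rough, self-contained sketch of the Quick Find scheme you described. The names N, array and scans, the main driver, and the particular sequence Union(i, i + 1) are just my own choices for illustration; it runs n - 1 UNIONs that each scan the whole array and counts the slots visited:

#include <stdio.h>

#define N 1000

int array[N];                           /* array[element] = set name, as in your example */
long scans = 0;                         /* counts array slots visited by UNION */

int Find(int a) {
    return array[a];                    /* one lookup: O(1) */
}

void Union(int a, int b) {
    int set1 = Find(a);
    int set2 = Find(b);
    if (set1 != set2) {
        for (int k = 0; k < N; k++) {   /* full scan: O(n) per UNION */
            scans++;
            if (array[k] == set1)
                array[k] = set2;        /* relabel every element of set1 */
        }
    }
}

int main(void) {
    for (int i = 0; i < N; i++)
        array[i] = i;                   /* every element starts in its own set */

    for (int i = 0; i < N - 1; i++)     /* n - 1 UNIONs */
        Union(i, i + 1);

    printf("n = %d, slots scanned = %ld\n", N, scans);   /* (n - 1) * n, on the order of n^2 */
    return 0;
}

With n = 1000 this scans (n - 1) * n = 999000 slots, roughly n^2, while each FIND in the same structure is still a single lookup.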
General rule: the heavier (more time-consuming) operation should be performed few times, and the lighter (cheaper) operation can be performed many times; the costs balance out when each operation's count is roughly inversely proportional to its cost.
I hope this is sufficient.