My code below is an implementation of the quick-find algorithm; I generally work from it when thinking about theoretical questions.
```c
#include <stdio.h>

#define N 10000

int main(void) {
    int i, t, p, q, id[N];

    /* each element starts in its own component */
    for (i = 0; i < N; i++)
        id[i] = i;

    printf("Input pair p q: ");
    while (scanf("%d %d", &p, &q) == 2) {
        if (id[p] == id[q])                      /* find: same component? */
            printf("%d %d already connected\n", p, q);
        else {
            /* union: relabel every element of p's component with q's id */
            for (t = id[p], i = 0; i < N; i++)
                if (id[i] == t)
                    id[i] = id[q];
            printf("pair %d %d not yet connected\n", p, q);
        }
        printf("Input pair p q: ");
    }
    return 0;
}
```
Let's consider a graph given by the pairs 12-6, 2-3, 0-9, 5-4, 3-10, 10-8. Supposing that the graph has N vertices, that there are k pairs, and that quick-find is used for the find operation, how many find operations will the on-line connectivity algorithm perform: O(kN), Θ(kN), O(k) or Θ(k)? Which of these will help me count the number of operations? I know the unit cost of a single find operation is O(1), because all it has to do is index into the array, but I am confused about what happens for a given number of pairs.
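To make the question concrete for myself, here is a rough sketch (my own instrumentation, not from any textbook) of how I would count find operations, assuming that each input pair costs two finds, namely the look-ups id[p] and id[q]:

```c
#include <stdio.h>

#define N 10000

/* Sketch: the same quick-find loop, counting what I take to be the
 * "find" operations, i.e. the id[p] and id[q] look-ups (2 per pair). */
int main(void) {
    int i, t, p, q, id[N];
    long pairs = 0, finds = 0;

    for (i = 0; i < N; i++)
        id[i] = i;

    while (scanf("%d %d", &p, &q) == 2) {
        pairs++;
        finds += 2;                  /* id[p] and id[q] are one find each */
        if (id[p] == id[q])
            continue;                /* already connected */
        for (t = id[p], i = 0; i < N; i++)
            if (id[i] == t)
                id[i] = id[q];       /* union: relabel p's whole component */
    }
    printf("%ld pairs, %ld find operations\n", pairs, finds);
    return 0;
}
```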
I am starting to learn complexity analysis, and the notations confuse me. I know big Theta is a tight bound and is more precise for expressing complexity than big O, but for the purpose of counting a number of operations I could not figure out which one fits better.
There are two plausible problem dimensions with respect to which one can consider the program's asymptotic complexity: the number of nodes, N, and the number of queries, k. Considering either one of these constant (no matter what its value) means that it factors out of the complexity analysis, because asymptotic complexity is about how some measure of the program's behavior scales with changes in the problem size. I will consider the scaling in both of these dimensions, according to the abstract-machine semantics of the C language in which the program is expressed:
* The program spends Θ(N) operations to initialize the id array, regardless of the number of queries.
* A single query requires Ω(1), O(N) steps:
  * If id[p] == id[q], then that query takes Θ(1) steps (a type 1 query).
  * Otherwise (a type 2 query), it takes Θ(N) steps, and it reduces the number of distinct id values by at least 1. There are initially N distinct id values, and that number cannot be reduced below 1, so at most N-1 queries can be of this type (see the sketch right after this list).
* The combined number of steps required for all the queries of the second type is therefore bounded by (N-1) * O(N) = O(N²), that is, N-1 queries, each requiring O(N) steps.
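As a rough illustration of that counting argument, here is a hypothetical instrumented version (small N, not the question's program) that checks the invariant behind it: every type 2 query removes at least one distinct id value, so there can never be more than N-1 of them:

```c
#include <assert.h>
#include <stdio.h>

#define N 8   /* small, just for illustration */

/* Count how many distinct values currently occur in id[]. */
static int distinct(const int id[N]) {
    int count = 0, i, j, seen;
    for (i = 0; i < N; i++) {
        for (seen = 0, j = 0; j < i; j++)
            if (id[j] == id[i]) { seen = 1; break; }
        if (!seen)
            count++;                 /* first occurrence of this value */
    }
    return count;
}

int main(void) {
    int i, t, p, q, id[N], before;
    long type2 = 0;

    for (i = 0; i < N; i++)
        id[i] = i;

    /* pairs are assumed to be in range 0..N-1 */
    while (scanf("%d %d", &p, &q) == 2 && p >= 0 && p < N && q >= 0 && q < N) {
        if (id[p] == id[q])
            continue;                /* type 1 query: nothing changes */
        before = distinct(id);
        for (t = id[p], i = 0; i < N; i++)
            if (id[i] == t)
                id[i] = id[q];       /* type 2 query: merge components */
        type2++;
        assert(distinct(id) <= before - 1);  /* distinct count dropped */
        assert(type2 <= N - 1);              /* hence at most N-1 unions */
    }
    return 0;
}
```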
I can choose a list of N-1 queries that will require Ω(N²) steps overall, so there is no tighter upper bound: N-2 N-1; N-3 N-1; ...; 0 N-1. Every one of these queries is a type 2 query, and each one scans the whole id array, for a total of (N-1)*N iterations overall (a small driver after the table below makes this concrete). That gives us:
| Ω | O | Explanation |
|---|---|---|
| N | N | setup |
| 1* | k | type 1 queries |
| 1* | N² | type 2 queries |
| k + N | k + N + N² | total |
| k + N | k + N² | standard reduced form |
\* There are k queries altogether, so that is a better lower bound on the total contribution of the queries than is 1 + 1 = 2.
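To see that this Ω(N²) behavior is actually reachable, here is a hypothetical driver (not part of the question's program) that feeds the worst-case pairs N-2 N-1, N-3 N-1, ..., 0 N-1 to the same union loop and counts its iterations:

```c
#include <stdio.h>

#define N 10000

/* Sketch: drive the quick-find union loop with the worst-case sequence
 * (N-2, N-1), (N-3, N-1), ..., (0, N-1). Every pair is a type 2 query
 * and scans all N entries, so the total is (N-1) * N iterations. */
int main(void) {
    int i, t, p, q, id[N];
    long iterations = 0;

    for (i = 0; i < N; i++)
        id[i] = i;

    q = N - 1;
    for (p = N - 2; p >= 0; p--) {     /* worst-case pair sequence */
        if (id[p] == id[q])
            continue;                  /* never taken for this input */
        for (t = id[p], i = 0; i < N; i++) {
            iterations++;
            if (id[i] == t)
                id[i] = id[q];
        }
    }
    printf("N = %d, union-loop iterations = %ld\n", N, iterations);
    return 0;
}
```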
> I am starting to learn complexity analysis, and the notations confuse me. I know big Theta is a tight bound and is more precise for expressing complexity than big O, but for the purpose of counting a number of operations I could not figure out which one fits better.
"More tightly bound" is a potentially misleading characterization of big-Theta. The difference between Θ and O is that O provides only an upper bound, whereas Θ provides both an upper and a lower bound. And in this context, one should not neglect big-Omega (Ω), which provides only a lower bound. You can always express upper (O) and lower (Ω) asymptotic bounds for a given function, but it is not always possible to express a useful two-sided bound.
Thus, one typically starts by computing big-O. This is not unique, but one normally tries to find as tight a bound as possible. The above analysis addresses the tightness of the upper bound by showing that there cannot be a standard-form upper bound any tighter than the one found.
One sometimes continues by computing big-Ω. This is useful for considering whether there are categories of inputs for which the algorithm is more efficient. For example, insertion sort requires O(N²) steps in the average and worst cases, but it is Ω(N) overall because it sorts inputs that are already approximately in order in linear time.
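As a concrete illustration of that last point, here is a standard textbook insertion sort (not code from the question); on input that is already nearly in order, the inner loop exits almost immediately, so the whole sort costs on the order of N steps, even though the worst case is quadratic:

```c
#include <stdio.h>

/* Standard insertion sort: O(N²) in the worst case, but Ω(N) overall,
 * because on (nearly) sorted input the inner loop does almost no work. */
static void insertion_sort(int a[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];
        for (j = i - 1; j >= 0 && a[j] > key; j--)
            a[j + 1] = a[j];          /* shift larger elements right */
        a[j + 1] = key;
    }
}

int main(void) {
    int a[] = {1, 2, 3, 5, 4, 6, 7, 8};   /* almost sorted input */
    int i, n = (int)(sizeof a / sizeof a[0]);

    insertion_sort(a, n);
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}
```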
If one can find a convenient functional form that can serve as both big-O and big-Ω, then that is also a useful big-Θ. If not, then one usually just ignores big-Θ. For example, we can always say that f(N) is Θ(f(N)), but we generally don't, because that does not give us any useful information.