Let G = (V, E) be a weighted, connected and undirected graph. Let T be the edge set that is grown in Kruskal's algorithm and stopped after k iterations (so T might contain less than |E|-1 edges). Let W(T) be the weighted sum of this set. Let T’ be an acylic edge set such that |T| = |T’|. Prove that W(T) <= W(T’)
I understand the original proof of the algorithm and I’ve tried several approaches to tackle this, neither worked.
For example: I thought an induction on |T| might work. For |T| = 1 it’s obvious.
We assume correctness for |T|=k and prove (or not…) for k+1. Assume by contradiction that there exists an edge set T’ such that |T’|=k+1 and W(T’) < W(T).
Let e be the last edge added by Kruskal algorithm. So for any edge f in T’, W(f) < W(e) (otherwise we remove the edges from the 2 sets and get a contradiction).
This can only happen if every edge in T’ is already in T or forms a cycle with T – {e}.
…
Note: It's not the same proof as in Kruskal's algorithm. We don't even know whether T' is connected.
I have no idea what to do next. I would really appreciate any help, Thanks in advance
Let
T’
be an edge set such that|T| = |T’|
. Prove thatW(T) <= W(T’)
.
You'll have a hard time doing that, since it's false in general.
Consider
1
A---B
2 \ / 3
C
| 4
D
Kruskal's algorithm produces the edge set T = { (A,B), (A,C), (C,D) }
, which is the unique minimal spanning tree.
But the edge set T' = { (A,B), (A,C), (B,C) }
has the same cardinality as T
, and
W(T') = 6 < W(T) = 7
There's some condition missing in the problem statement (like that T'
should connect the graph).
You're right. I forgot to mention that there is no cycle in
T'
In that case, T'
spans a tree(1). And since |T'| = |T|
is assumed, the tree that T'
spans connects the graph, i.e. is a spanning tree.
(1) From the absence of cycles, it follows directly that each connected component of T'
is a tree. A tree with n
vertices has n-1
edges. Thus if T'
has k
connected components, the number of vertices in the graph is
V = |T'| + k
But T
is a spanning tree, and |T| = |T'|
, hence
V = |T| + 1 = |T'| + 1
which implies k = 1
.
Thus you are asked to simply prove the correctness of Kruskal's algorithm. You can find proofs in the literature easily, for example on wikipedia.
A proof of correctness (by induction on the number of vertices):
Lemma: Let G
be a connected graph with N > 1
vertices, and T
a minimal spanning tree of G
. Let e
be an edge in T
.
Then T \ {e}
projects to a minimal spanning tree of the graph G'
obtained from G
by identifying the two endpoints a
and b
of e
. Conversely, if T'
is a set of edges of G
that projects to a minimal spanning tree of G'
, then T' ∪ {e}
is a minimal spanning tree of G
.
Proof: Let p : G -> G'
be the projection identifying a
and b
.
Then p(T \ {e})
has no cycles.
Suppose p(T \ {e})
contained a cycle C
. Then p^(-1)(C)
must be a path connecting a
and b
. But then T
would contain the cycle p^(-1)(C) ∪ {e}
, contradicting the premise that T
is a tree.
Thus p(T \ {e})
is a cycle-free set of edges of G'
with cardinality N - 2
, and that implies (see above) that it is a spanning tree.
Let T''
be a minimal spanning tree of G'
and S = p^(-1)(T'')
.
Then S ∪ {e}
has no cycles.
If there were a cycle in S
, that would project to a cycle in T''
, so every cycle in S ∪ {e}
must contain e
. Suppose C
were a cycle in S ∪ {e}
. Then C \ {e}
is a path connecting a
and b
, thus C \ {e}
projects to a cycle in G'
, since a
and b
project to the same vertex of G'
. That contradicts the premise that T''
is a tree.
So S ∪ {e}
is an edge set of cardinality N - 1
without cycles, and hence (see above) a spanning tree of G
.
Then W(T) <= W(S ∪ {e})
since T
is a minimal spanning tree, and thus
W(p(T \ {e})) = W(T \ {e}) <= W(S) = W(T'')
Since T''
is assumed to be a minimal spanning tree of G'
, it follows that equality holds, and that p(T \ {e})
is a minimal spanning tree of G'
, and that S ∪ {e}
is a minimal spanning tree of G
.
Now to the induction to prove the correctness of Kruskal's algorithm:
For a graph with at most two vertices, it is obvious that the algorithm produces a minimal spanning tree.
For n >= 2
, assume the correctness of the algorithm for all connected graphs with at most n
vertices. (Induction hypothesis)
Let G
be a connected graph with n+1
vertices. Let e
be the first edge chosen in the algorithm, and a
and b
its endpoints.
Let G'
be the graph obtained from G
by identifying a
and b
, and p :: G -> G'
the projection.
Let T
be the edge set selected by the algorithm.
Then p(T \ {e})
is the edge set selected by Kruskal's algorithm on G'
. Thus, by the Lemma above, T
is a minimal spanning tree of G
.
(Okay, probably the proof in wikipedia is simpler, but I wanted to produce a different one.)