
graph - How to obtain the Minimum Weight Connected Subset?


Here is an exercise:

Consider the problem of finding a minimum weight connected subset T of edges from a weighted connected graph G. The weight of T is the sum of all the edge weights in T. Give an efficient algorithm to compute the minimum weight connected subset T.

Here is what I have so far:

  1. I have to assume the weights are a mix of positive and negative values. The exercise only makes sense with both kinds of weights.

  2. I will sort the edges first, so the negative edges come first.

  3. I am considering Kruskal's algorithm, but with some modifications.

  4. Because negative edges reduce the total weight, I will try to add as many of them as possible.

  5. In addition, some positive edges may have to be added in case the negative edges alone do not connect everything and need positive edges as bridges.


OK, that is my thinking so far. But when I try to implement it, I get stuck.

How can I keep track of the minimum-weight set found so far?

For example,

{0, 1} has weight -20

{2, 3} has weight -10

if {1, 3} has weight 11, then of course I don't want {1, 3}

or if {1, 3} has weight 9, then I do want it.

What kind of data structure lets me always keep the minimum weight and the vertices for that weight?


It is worth noting that the subset this exercise asks for is a subset of edges:

Consider the problem of finding a minimum weight connected subset T of edges from a weighted connected graph G

This means that all vertices still need to be included.

Also, it is more than an MST. Consider a vertex with two incident edges, one of weight -1 and the other of weight -2. A normal MST algorithm may take only the -2 edge (for example, when the -1 edge would close a cycle). But in this exercise, both -1 and -2 need to be taken to reduce the overall weight further.


Solution

  • I think your algorithm is mostly correct already, but with slight modifications it becomes trivial to implement.

    First, every negative edge has to be included in order to minimize the resulting weight. Next, calculate the number of connected components c. If c=1, you're done. Otherwise you need c-1 extra non-negative edges to join the components.

    Now, treat the step of adding negative edges as part of a Kruskal's algorithm run. Each negative edge may unite two trees in Kruskal's forest. However, you add the negative edge even if both its ends belong to the same tree in the forest, unlike the usual Kruskal's algorithm, where you only add edges that unite two different trees.

    After this phase, you're left with a graph of c connected components (they may not be trees anymore). Now just continue Kruskal's algorithm as usual. Process the remaining edges in increasing order of weight, keeping track of the number of unions you've made with them. Once this number reaches c-1, you're done.

    By the way, the whole process of Kruskal's algorithm can be implemented easily if you represent the forest as a disjoint-set data structure. It takes just a few lines of code to write, and after that it is trivial to keep track of the number of unions that were made.
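
To illustrate the disjoint-set structure mentioned above, here is a minimal Python sketch. The class name `DisjointSet`, the `components` counter, and the use of path halving are assumptions made for this example, not something the answer prescribes; any union-find variant would do.

```python
class DisjointSet:
    """Union-find over vertices 0..n-1 (hypothetical helper for illustration)."""

    def __init__(self, n):
        self.parent = list(range(n))  # every vertex starts as its own root
        self.components = n           # number of disjoint sets right now

    def find(self, v):
        # Path halving: point each visited node at its grandparent.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def unite(self, a, b):
        """Merge the sets of a and b; return True if they were separate."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        self.components -= 1
        return True
```

Because `unite` reports whether a merge actually happened, counting the unions (or reading `components`) needs no extra bookkeeping.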


    Some pseudocode follows:

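    sort(edges);                  // increasing weight, so negative edges come first
    c := n;                       // number of connected components so far
    T := {};                      // the chosen edges
    for edge in edges:
        if edge.weight < 0:
            // a negative edge is always taken, even if it closes a cycle
            if find(edge.firstEnd) != find(edge.secondEnd):
                --c;
            unite(edge.firstEnd, edge.secondEnd);
            T.add(edge);
        else:
            if c == 1: break;
            // a non-negative edge is taken only if it joins two components
            if find(edge.firstEnd) != find(edge.secondEnd):
                unite(edge.firstEnd, edge.secondEnd);
                --c;
                T.add(edge);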
    

    Here unite and find are the operations of the disjoint-set data structure, and n is the number of vertices.
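
For anyone who wants to run it, the pseudocode above could be sketched in Python roughly as follows. The function name `min_weight_connected_subset` and the `(weight, u, v)` edge format are assumptions for this example, and it assumes the graph is connected with vertices numbered 0..n-1.

```python
def min_weight_connected_subset(n, edges):
    """edges: list of (weight, u, v) tuples over vertices 0..n-1.

    Returns (total_weight, chosen_edges). Assumes the graph is connected.
    """
    parent = list(range(n))

    def find(v):
        # Path halving keeps the trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = n
    chosen = []
    for w, u, v in sorted(edges):  # tuples sort by weight first
        if w >= 0 and components == 1:
            break  # everything is connected; remaining edges only add weight
        ru, rv = find(u), find(v)
        joins = ru != rv
        # Negative edges are always taken; others only if they join components.
        if w < 0 or joins:
            chosen.append((w, u, v))
        if joins:
            parent[ru] = rv
            components -= 1
    return sum(w for w, _, _ in chosen), chosen


# The question's weights, plus a hypothetical extra edge {0, 2} of weight 11.
total, chosen = min_weight_connected_subset(
    4, [(-20, 0, 1), (-10, 2, 3), (9, 1, 3), (11, 0, 2)]
)
print(total, chosen)
```

On a triangle with weights -1, -2 and -3, this sketch keeps all three edges (total -6), whereas an MST would stop at two of them (total -5).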