I am looking for a time-effective algorithm for this particular problem:
I have undirected graph with up to 10,000 vertexes and about 1-10 edges out of given vertex.
Now I will remove chosen edge from graph and I want to know if the edge I just removed was a bridge - and if so, what are the vertices connected on both sides. I will repeat the step of removing edge frequently, possibly till I get 10 000 disconnected vertices (each time I need the information of bridge). So, for example last edge should inform that it was a bridge with a single vertex on one side and a single vertex on the other.
Preprocessing of data is fully acceptable, the memory costs - within reasonable limits of modern PC - are okay. I am looking for an algorithm that optimize time of edge removal operation.
My tool of trade is C#, but any pseudo-code or idea I would gladly accept
A close relative of this problem is studied in the literature under the name "decremental connectivity". The main difference is that you want to enumerate the new connected components instead of being able to answer connectivity queries.
This paper seems to be the theoretical state of the art, but having skimmed it, I don't think that the new ideas will be useful to you, since you have a graph that's
My recommendation would be a simplified version of Even--Shiloach where the BFS part is omitted (it adds complexity and memory consumption but doesn't seem obviously winning in a sparse graph). The idea is to maintain
If an edge not in the forest gets deleted, then we just delete it. Otherwise, we pick one side of the tree that just got cut (doesn't matter for correctness which side, but you want the smaller one for efficiency; I recommend rooting the spanning trees and using the child), traverse it to provisionally update its label to something new, and then scan all of its edges to determine whether the edge that was just deleted is a bridge to the rest of the tree. If it is, great; traverse the other new tree for reporting the vertex sets. Otherwise, we have to fix up the labels and the tree structure.
For the data structures I recommend a compact adjacency list. Since there are at most 10,000 nodes, a node index will fit in a 16-bit short
. There are up to 100,000 half-edges, unfortunately, so we'll use a 32-bit int
to represent an edge index. We use an int
array to point into a short
array; entry 2*v
is the beginning of the adjacency list for node v
, and entry 2*v+1
is the end. For cache locality we store two extra pieces of data alongside the adjacency list: the connected component label, and the number of descendants in the spanning forest. Order the spanning forest descendants before the other neighbors.
Overall here is what this looks like on a simple graph:
Graph:
0
/ \
/ \
1-----3 2
Spanning forest:
0
/
/
1-----3 2
Arrays:
[| | | | | | | | ]
| | | \ | | | |
| | | \ \ | | \
| \ \ \ | | | |
v v v v v v v v
[0 1 1 3 x 0 1 3 0 2 0 x 0 0 0 1]
^ ^ \ /
| | |
| | adjacency list; descendants (3) before others (0)
| |
| number of spanning forest descendants
|
connected component label
= root of tree in spanning forest
I left x
s to represent a previously deleted edge from 0
to 2
.
(I don't have a lot of experience with C#, so it's possible that the mandatory bounds checking will be a problem, but I can't imagine that pointers would be better, especially on a 64-bit machine.)