What algorithm produces a graph with N vertices and M edges that's uniformly random in a reasonable time complexity (worst-case quasi-linear)?
The graph is undirected, with neither multi-edges nor loop edges.
N
vertices.M
edges.Since there aren't additional requirements, like automata languages, this is uniformly distributed:
V <- {1, ..., N}
E <- {}
for 1 to M do
edge <- (random(V), random(V))
E.push(edge)
return (V, E)
For undirected graphs without multi-edges or loops, keep generating random edges until there's a valid edge:
V <- {1, ..., N}
E <- {}
for 1 to M do
repeat
source <- random(V)
edge <- (source, random(V \ {source}))
until E does not contain edge
E.push(edge)
return (V, E)
For the destination use random(V \ source)
to exclude loops (the backslash, \
, is set theory notation for set exclusion, so choose V such that it is not in the source set). The E does not contain edge
should not care for the direction. That is, it should consider:
(a, b) = (b, a)
For efficiency of the contains
you should use some hashed structure when implementing.
The problem is the repeat
loop. Depending on how fast random
works and how close M
is to the amount of possible edges, it may take a while. You can speed this up using techniques like the Floyd-Rivest algorithm (also see Algorithm to select a single, random combination of values?).
If M
is rather small, compared to the maximal amount, it should run fast (linear in N
and M
) since you don't get a lot of edge collisions.
Example implementation:
import java.io.PrintStream;
import java.util.*;
import static java.lang.String.format;
public class UniformGraph {
/**
* If creating an edge between two random vertices takes more than this
* many tries, skip the edge. This ensures creating the graph will terminate.
*/
private static final int MAX_ITERATIONS = 10;
/**
* Represents a line segment between point A and point B. Note that points
* may require conversion to a coordinate system. For example, a Cartesian
* plane with coordinates (X, Y) and a maximum X value Mx, a vertex offset
* can be calculated using X + (Y * Mx).
*
* @param a The starting line segment point.
* @param b The ending line segment point.
* @param <T> The data type representing points.
*/
private record Tuple<T extends Comparable<T>>( T a, T b )
implements Comparable<Tuple<T>> {
public String toString() {
return format( "%s -> %s", a(), b() );
}
@Override
public int compareTo( final Tuple<T> o ) {
return a().compareTo( o.a() );
}
}
public UniformGraph() {}
/**
* @param n Number of vertices in the graph.
* @param m Number of edges in the graph.
*/
public Set<Tuple<Integer>> generate( final int n, final int m ) {
assert n > 1;
assert m > 1;
final var mRandom = new Random();
final var edges = new TreeSet<Tuple<Integer>>();
for( int i = 0; i < m; i++ ) {
var iter = 0;
boolean conflict;
Tuple<Integer> edge;
do {
final var vertex1 = mRandom.nextInt( n );
final var vertex2 = mRandom.nextInt( n );
edge = new Tuple<>( vertex1, vertex2 );
conflict = edges.contains( edge ) || vertex1 == vertex2;
}
while( conflict && ++iter < MAX_ITERATIONS );
edges.add( edge );
}
return edges;
}
public void print( final Set<Tuple<Integer>> edges, final PrintStream out ) {
for( final var edge : edges ) {
out.println( edge );
}
}
public static void main( final String[] args ) {
final var graph = new UniformGraph();
final var edges = graph.generate( 20, 9 );
graph.print( edges, System.out );
}
}