Search code examples
erlangsubsetdirected-graphsubgraphpowerset

Generate all possible subgraphs of a directed graph keeping the number of vertices


I have two lists of vertices: V and S.

I would like to generate all possible directed graphs from V and S so, that each vertex from V has only one out-edge and exactly one in-edge, and each vertex from S can have any number of in- and out- edges. Each graph in the result should contain exactly all vertices from V and from S. The result can contain both connected and disconnected graphs.

First I thought it was a powerset-related problem, but powerset has many other sets that may contain just one element (and I do not need those).

My current strategy is to:

  • find all pairs between vertices from V, add to Pairs;
  • find all pairs between vertices from S, add to Pairs;
  • find all pairs between vertices from V and S, add to Pairs;
  • generate subsets of Pairs of size not less than V in a such way, that each subset has exactly one instance of the vertex v in the first position, one instance of the vertex v in the second position and any number of instances of any vertex s from S in any position.

I am not sure this is right and I would like to know about any ideas.

Maybe I could create a fully-connected graph G from V and S and then somehow extract subgraphs from it? (perhaps with the help of digraph:utils)

P.S. I am trying to solve this in Erlang, because it is the language that I am using and actively learning now. But I would be glad to see Java, Ruby or pseudocode in the answer.


Solution

  • Well, it is very nice problem, but you can do some very nice tricks there. You can present graph as square matrix of 0 and 1 where number of rows and columns is number of vertices. Each 1 presents edge from vertex in row to vertex in column. Then every possible graph is one big binary number with N^2 bits i.e. there is 2^(N^2) possible graphs made from N vertices. Now you can divide your issue to two parts.

    1. Generate all possible graphs from S - each graph is just one N^2 bit number.
    2. Then for each this graph you expand matrix by V rows and columns and multiply possibilities by combinations where is exactly one 1 in each V row and column.

    I need one clarification from you. You wrote ... and each vertex from S can have any number of in- and out- edges Is 0 included in any number? Then I don't understand result should contain exactly all vertices from V and from S Constrain to S doesn't have sense because each S is included in each solution as vertex with zero in and out edges. Otherwise it's not all N^2 bit numbers but only those which have at least one 1 in each row and column and then you can't divide your issue to two parts and you have to solve S and V together. Then it may be easier satisfy V rows and columns first (exactly one 1 in row and column) and then multiply each this solution by SxS matrix solutions which satisfy S constrain when you have to have at least one 1 in row and column.

    You can experiment with various representations as list of lists for small number of vertices. You can try array module when you will compute index as R+C*N or you can try array of array. For bigger numbers of vertices using array of binaries can be doable. You can even try digraph module here. Similar approach worked well for generating full closure graphs in perl where binary manipulations using vec sucks. But it doesn't seem case here because it is very different issue. You may found some better optimized presentations but I doubt Erlang is best tool for this sort of computations very efficiently. Anyway number of possibilities is growing pretty fast here O(2^(N^2)) so you don't have to worry about efficient storage of big matrices ;-)

    Edit:

    Look at problem from this perspective. If you have 10 vertices and assume it is V and S is empty. Then there is 10! possible graphs i.e. 3628800. If it would be S there is approx 2^100 i.e. 1.2677e+30 graphs (4.0197e+13 years if you will generate 1e+09 graphs per second). It means for any number of vertices bigger than very few you have very big number of possible graphs. So biggest question here is, what will you do with them. You can't store them but if yes so you have to store them very efficiently. Binary field is most efficient way for storing graphs made from S. You can find more efficient way for edges with V vertices. I would store it as array or list where position is vertex where edge goes from and value is vertex where edge goes to.

    So your solution strongly depend what you will do with result. I think you will filter out results in some way because I can't imagine what you will do with so big number of graphs ;-) Mine advice is try filter out meaningful graphs as early as possible during graphs generation. It means your approach should be determined by purpose to enable involve your result filtering in generating algorithm.

    And about Erlang efficiency for this problem, if you deal with so enormous number of entities (possible graphs) you have to manage your memory and CPU usage very carefully. You can use Erlang but for some time critical parts you should use C in NIFs. You also should use more memory friendly data structures as binaries or bignums instead of lists or tuples. You also can find some other languages as C, C++, OCaml, Haskell, ... more suitable for such memory and computation intensive task.