
A human-readable textual representation of a Directed Acyclic Graph


There are plenty of textual representations of a tree that are both human- and machine-readable, e.g. a nested list (in various serializations such as JSON and YAML) and XML. Combined with indentation, they make the resulting structure easy to picture.

But I don't see anything with the same level of readability for a Directed Acyclic Graph. It's a more general data structure than a tree, so the above formats can't be used (verbatim, anyway).

  • All human-readable representations that I've ever seen were graphical
  • The raw textual representation would be to list all nodes and their connections -- which makes it hard to imagine the graph if there are more than a few nodes

The application I have in mind is flowcharts of all kinds, which emerge naturally in, e.g., planning tasks.


To limit the scope of the question, I'm primarily asking for standard solutions, or at least for ones that are production-ready and proven to work in some area of practice. If there are none, any experimental proposals that have passed some sort of peer review (e.g. proposed in a published scientific paper) will have to do.


Solution

  • I'm going to go with a YAML nested list with anchors. (Which is equivalent to XML with entities but the latter has more noise.)
    (I was already considering it but was wondering if anything better has been invented. Looks like it hasn't. But most importantly, @Patrick87 formally showed that it's an adequate representation.)

    It's equivalent to the formal regular expression representation suggested by @Patrick87 if I replace composition with indentation and union with no indentation; and the anchors make it possible to eliminate the duplication of a subgraph under a node when it's referenced multiple times.


    E.g. @GuyCoder's example

    A->B
    A->C
    A->D
    B->E
    B->F
    C->E
    C->G
    D->F
    D->G
    E->H
    F->H
    G->H
    

    that corresponds to A(B(E+F)+C(E+G)+D(F+G))H = A(B(EH+FH)+C(EH+GH)+D(FH+GH))

    would be

    - A
        - B
            - &E E
                - &H H
            - &F F
                - *H
        - C
            - *E
            - &G G
                - *H
        - D
            - *F
            - *G
            
    

    (For uniformity, the first occurrence of every node could be made an anchor, e.g. when the file is generated.)


    It doesn't matter if the graph is planar or not because any cross-cutting links are simply not "drawn".

    As a bonus, it allows specifying the data attached to each node, as a hash table rooted at the node. (Though past a certain size, it'll probably be clearer to place the data separately.)
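
For illustration, here is a minimal Python sketch of how the anchored outline and the expanded expression could be generated from the edge list of the example. All function names (`build_children`, `find_roots`, `outline`, `expression`) are mine and purely hypothetical, and the sketch assumes the input really is acyclic. It reproduces the notation used in this answer as plain text and does not validate it with a YAML parser; a strict parser may need the node names to be mapping keys before it attaches the children to them.

```python
# Edge list from the example (A->B, A->C, ...).
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "E"), ("B", "F"),
    ("C", "E"), ("C", "G"),
    ("D", "F"), ("D", "G"),
    ("E", "H"), ("F", "H"), ("G", "H"),
]


def build_children(edges):
    """Map every node to the list of its children, keeping edge order."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
        children.setdefault(dst, [])  # make sure leaves are registered too
    return children


def find_roots(children):
    """Return the nodes that never appear as a child."""
    targets = {c for kids in children.values() for c in kids}
    return [n for n in children if n not in targets]


def outline(children, roots, indent=4):
    """Emit the nested list: anchor on first visit, alias on later visits."""
    # Only nodes with more than one parent need an anchor.
    parents = {}
    for kids in children.values():
        for c in kids:
            parents[c] = parents.get(c, 0) + 1
    shared = {n for n, count in parents.items() if count > 1}

    lines, seen = [], set()

    def visit(node, depth):
        pad = " " * (indent * depth)
        if node in seen:
            # Subgraph already written once: reference it, do not repeat it.
            lines.append(f"{pad}- *{node}")
            return
        seen.add(node)
        label = f"&{node} {node}" if node in shared else node
        lines.append(f"{pad}- {label}")
        for child in children[node]:
            visit(child, depth + 1)

    for root in roots:
        visit(root, 0)
    return "\n".join(lines)


def expression(children, node):
    """Expand the subgraph under `node` into the regular-expression form."""
    parts = [expression(children, k) for k in children[node]]
    if not parts:
        return node
    if len(parts) == 1:
        return node + parts[0]
    return f"{node}({'+'.join(parts)})"


if __name__ == "__main__":
    graph = build_children(EDGES)
    print(expression(graph, "A"))
    print(outline(graph, find_roots(graph)))
```

Anchoring only the nodes with more than one parent keeps the output close to the hand-written version; anchoring every node instead would just mean dropping the `shared` check.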