graph, gephi, graph-visualization

Layout for a family tree


I have a dataset of DNA relationships (as a percent match) between myself and a few hundred relatives, almost all of them distant. I also have data on the DNA relationships between each of them and certain other members of the dataset.

I'm hoping to build a network graph that shows the interrelationships and have Gephi build something that loosely resembles a family tree. But even using a small sample database I can't get the resulting graph to look anything like that.

I want each relationship (i.e. edge) to have a "force" related to the closeness of the relationship, so distant relatives (nodes) are pushed further away. I want the graph to self-assemble based on these "forces" and assume there is a layout for this, but I haven't found one.

I'm currently putting the DNA relationship in the weight column, and not using the interval column at all. But even using just 8 relatives and artificially perfect data I have to manually move nodes around to make it look remotely useful.

What layout should I use for this type of graph, and what other advice can you offer to make this work? Should the weight field increase or decrease as relationship distance increases?


Solution

  • … and have Gephi build something that loosely resembles a family tree. But even using a small sample database I can't get the resulting graph to look anything like that.

    A family tree (mostly) connects parents to their descendants. DNA similarity (as a percentage) links any two people who share DNA, so it does not conform to this structure. Related questions may be answered here.

    Applying an Edge Weight filter (Library > Edges > Edge Weight) to the DNA similarity attribute may help, but it will not produce "something that loosely resembles a family tree".

    I want each relationship (i.e. edge) to have a "force" related to the closeness of the relationship, so distant relatives (nodes) are pushed further away. I want the graph to self-assemble based on these "forces" …

    Force-directed layouts work like that. However, Gephi has no built-in hierarchical layout. Third-party candidates include EventGraphLayout, Layered Layout and Concentric Layout.

    Should the weight field increase or decrease as relationship distance increases?

    The greater an edge's weight, the stronger its connection, which results in less distance between the nodes it connects. For a family tree, however, this is irrelevant.
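Since a heavier edge pulls its nodes closer, the percent match can serve as the weight directly. Below is a minimal sketch of preparing such an edge table for Gephi's spreadsheet import, assuming the matches are available as (person, person, percent) tuples. The function name `write_gephi_edges`, the `min_percent` cutoff and the sample names are hypothetical and not part of the original question.

```python
import csv
import io


def write_gephi_edges(matches, out, min_percent=0.0):
    """Write (source, target, percent_match) tuples as a Gephi edge table.

    The percent match is used unchanged as the edge weight, so closer
    relatives get heavier edges. Matches at or below min_percent are
    skipped (an assumption of this sketch, similar in spirit to
    filtering by edge weight inside Gephi).
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for source, target, percent in matches:
        if percent <= min_percent:
            continue  # too distant (or no shared DNA): no edge
        writer.writerow([source, target, "Undirected", percent])


# Hypothetical, artificially perfect sample data.
matches = [
    ("me", "parent", 50.0),
    ("me", "first_cousin", 12.5),
    ("me", "stranger", 0.0),
]

buffer = io.StringIO()
write_gephi_edges(matches, buffer)
edge_csv = buffer.getvalue()
print(edge_csv)
```

Writing to a real file instead of `io.StringIO` gives a CSV that can be imported as an edges table. Whether a cutoff improves the picture depends on the data.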

     

    I'm hoping to build a network graph that shows the interrelationships between each member …

     

    What layout should I use for this type of graph, and what other advice can you offer to make this work?

    The following steps emphasize clustering and modularity:

    1. Calculate modularity.

    2. Color nodes by modularity class:
      Appearance > Nodes > Partition > Modularity Class

    3. Apply a layout; ForceAtlas 2 for example (with Dissuade Hubs, LinLog mode and Prevent Overlap enabled).

    Apply the Contraction layout afterwards if necessary. Optionally, set the node size according to (for example) Eigenvector Centrality before applying the layout.
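To make the modularity step more concrete, here is a rough sketch of the score itself in plain Python. It only evaluates a partition supplied by hand and does not search for one, which is what Gephi's modularity calculation does. The function name `modularity`, the node names and the uniform weights are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a given partition of an undirected weighted graph.

    edges: iterable of (u, v, weight), each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    total_weight = 0.0               # sum of all edge weights (m)
    internal = defaultdict(float)    # edge weight inside each community
    degree_sum = defaultdict(float)  # summed weighted degree per community
    for u, v, w in edges:
        total_weight += w
        degree_sum[community_of[u]] += w
        degree_sum[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    if total_weight == 0:
        return 0.0
    return sum(
        internal[c] / total_weight - (degree_sum[c] / (2 * total_weight)) ** 2
        for c in degree_sum
    )


# Hypothetical example: two family branches joined by a single link.
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),  # branch 1
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),  # branch 2
    ("c", "d", 1.0),                                    # bridge
]
by_branch = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
all_together = {node: 0 for node in by_branch}

q_branches = modularity(edges, by_branch)
q_single = modularity(edges, all_together)
print(q_branches, q_single)
```

Splitting the graph along the two branches scores higher than lumping everyone together, which is the kind of grouping the Modularity Class coloring makes visible.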