Tags: r, plot, graph, igraph, genealogy

Plot a genealogy in R


d = data.frame(
   offspring = c("G2I1", "G2I2", "G2I3", "G3I1", "G3I2", "G3I3", "G3I4", "G4I1", "G4I2", "G4I3", "G4I4", "G5I1", "G5I2", "G5I3"  ),
   parent1   = c("G1I1", "G1I2", "G1I1", "G2I1", "G2I3", "G2I1", "G2I3", "G3I2", "G3I2", "G3I1", "G3I4", "G4I3", "G4I3", "G4I1" ),
   parent2   = c("G1I3", "G1I2", "G1I2", "G2I2", "G2I2", "G2I2", "G2I3", "G3I4", "G3I1", "G3I2", "G3I4", "G4I1", "G4I1", "G4I2" )
)

print(d)

        offspring parent1 parent2
    1       G2I1    G1I1    G1I3  # generation 2
    2       G2I2    G1I2    G1I2  # generation 2
    3       G2I3    G1I1    G1I2  # generation 2
    4       G3I1    G2I1    G2I2  # generation 3
    5       G3I2    G2I3    G2I2  # generation 3
    6       G3I3    G2I1    G2I2  # generation 3
    7       G3I4    G2I3    G2I3  # generation 3
    8       G4I1    G3I2    G3I4  # generation 4
    9       G4I2    G3I2    G3I1  # generation 4
    10      G4I3    G3I1    G3I2  # generation 4
    11      G4I4    G3I4    G3I4  # generation 4
    12      G5I1    G4I3    G4I1  # generation 5
    13      G5I2    G4I3    G4I1  # generation 5
    14      G5I3    G4I1    G4I2  # generation 5

Data representation

This data represents a genealogy. Each row lists an offspring and its two parents. I call them parent1 and parent2 because the individuals are hermaphrodites. They can also clone themselves, in which case parent1 and parent2 are the same individual. Generations are non-overlapping, meaning that all parents of the offspring of generation n were born in generation n-1.

Let's consider an example. Individual G3I4 was born in generation 3 (G3) and has index 4 within that generation (I4; the index is just an ID). This individual is a parent of individual G4I1 and of individual G4I4. In fact, G3I4 is the only parent of G4I4, as she cloned herself.
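The two rules described above (clones have identical parents, and parents always come from the previous generation) can be checked directly on the data frame. This is a minimal sketch in base R; the helper `generation_of` and the variables `clones` and `non_overlapping` are illustrative names that do not appear in the original post, and the code assumes every ID has the form `G<generation>I<index>`.

```r
d <- data.frame(
  offspring = c("G2I1", "G2I2", "G2I3", "G3I1", "G3I2", "G3I3", "G3I4", "G4I1", "G4I2", "G4I3", "G4I4", "G5I1", "G5I2", "G5I3"),
  parent1   = c("G1I1", "G1I2", "G1I1", "G2I1", "G2I3", "G2I1", "G2I3", "G3I2", "G3I2", "G3I1", "G3I4", "G4I3", "G4I3", "G4I1"),
  parent2   = c("G1I3", "G1I2", "G1I2", "G2I2", "G2I2", "G2I2", "G2I3", "G3I4", "G3I1", "G3I2", "G3I4", "G4I1", "G4I1", "G4I2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse the generation number out of an ID such as "G3I4"
generation_of <- function(id) as.integer(sub("^G([0-9]+)I[0-9]+$", "\\1", id))

# Offspring produced by cloning: both parent columns name the same individual
clones <- d$offspring[d$parent1 == d$parent2]

# TRUE for each row whose parents both belong to the generation just before the offspring's
non_overlapping <- generation_of(d$parent1) == generation_of(d$offspring) - 1 &
  generation_of(d$parent2) == generation_of(d$offspring) - 1

print(clones)
print(all(non_overlapping))
```

For this data set, `clones` contains G2I2, G3I4 and G4I4, and every row satisfies the non-overlapping rule.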

Question

How can I graph this genealogy in R?

Related post

The post How to plot family tree in R is closely related, but I failed to apply it to my data. The first answer uses igraph, which I am not very familiar with. I tried the following but failed to get anything good looking:

library(tibble)
library(igraph)

d = tibble(
   offspring = c("G2I1", "G2I2", "G2I3", "G3I1", "G3I2", "G3I3", "G3I4", "G4I1", "G4I2", "G4I3", "G4I4", "G5I1", "G5I2", "G5I3"  ),
   parent1   = c("G1I1", "G1I2", "G1I1", "G2I1", "G2I3", "G2I1", "G2I3", "G3I2", "G3I2", "G3I1", "G3I4", "G4I3", "G4I3", "G4I1" ),
   parent2   = c("G1I3", "G1I2", "G1I2", "G2I2", "G2I2", "G2I2", "G2I3", "G3I4", "G3I1", "G3I2", "G3I4", "G4I1", "G4I1", "G4I2" )
)

# Edge list: one edge from each parent to its offspring
d2 = data.frame(from=c(d$parent1,d$parent2), to=rep(d$offspring,2))
g = graph_from_data_frame(d2)
# layout.reingold.tilford() is the older name of layout_as_tree()
co = layout.reingold.tilford(g, flip.y=TRUE)
plot(g, layout=co)


This produces a plot, but some individuals that don't leave any offspring are missing from the graph.
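One way to tell whether individuals are truly absent from the graph object or only hidden in the drawing is to compare the vertex names against the full list of IDs. The sketch below is a suggestion rather than part of the original attempt; `all_ids`, `missing_ids` and `leaves` are illustrative names. If `missing_ids` comes back empty, every individual is a vertex and the problem lies in the layout (for example, nodes drawn on top of each other) rather than in the graph itself.

```r
library(igraph)

d <- data.frame(
  offspring = c("G2I1", "G2I2", "G2I3", "G3I1", "G3I2", "G3I3", "G3I4", "G4I1", "G4I2", "G4I3", "G4I4", "G5I1", "G5I2", "G5I3"),
  parent1   = c("G1I1", "G1I2", "G1I1", "G2I1", "G2I3", "G2I1", "G2I3", "G3I2", "G3I2", "G3I1", "G3I4", "G4I3", "G4I3", "G4I1"),
  parent2   = c("G1I3", "G1I2", "G1I2", "G2I2", "G2I2", "G2I2", "G2I3", "G3I4", "G3I1", "G3I2", "G3I4", "G4I1", "G4I1", "G4I2"),
  stringsAsFactors = FALSE
)

# Same edge list as in the attempt above: one edge from each parent to its offspring
d2 <- data.frame(from = c(d$parent1, d$parent2), to = rep(d$offspring, 2))
g <- graph_from_data_frame(d2)

# Every ID that occurs anywhere in the data
all_ids <- unique(c(d$offspring, d$parent1, d$parent2))

# IDs that are not vertices of the graph
missing_ids <- setdiff(all_ids, V(g)$name)

# Individuals without offspring are the vertices with no outgoing edge
leaves <- V(g)$name[degree(g, mode = "out") == 0]

print(missing_ids)
print(leaves)
```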

The second answer uses kinship2. To my understanding, kinship2 can't deal with asexual reproduction.


Solution

  • The code below uses layout_as_tree with the generation-1 individuals as roots. The only problem I see in the result is the overlapping nodes in G1. With more information, I am happy to tweak the output as necessary.

    library(igraph)

    d = data.frame(
      offspring = c("G2I1", "G2I2", "G2I3", "G3I1", "G3I2", "G3I3", "G3I4", "G4I1", "G4I2", "G4I3", "G4I4", "G5I1", "G5I2", "G5I3"  ),
      parent1   = c("G1I1", "G1I2", "G1I1", "G2I1", "G2I3", "G2I1", "G2I3", "G3I2", "G3I2", "G3I1", "G3I4", "G4I3", "G4I3", "G4I1" ),
      parent2   = c("G1I3", "G1I2", "G1I2", "G2I2", "G2I2", "G2I2", "G2I3", "G3I4", "G3I1", "G3I2", "G3I4", "G4I1", "G4I1", "G4I2" ),
      stringsAsFactors = FALSE
    )

    # Edge list: one edge from each parent to its offspring
    d2 = data.frame(from=c(d$parent1,d$parent2), to=rep(d$offspring,2))
    g = graph_from_data_frame(d2)

    # Tree layout rooted at every generation-1 individual
    co1 <- layout_as_tree(g, root = which(grepl("^G1I", V(g)$name)))
    plot(g, layout=co1, edge.arrow.size=0.25)
    

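The overlapping G1 nodes mentioned above could be avoided by skipping the tree layout and placing each node by hand, since the IDs already encode a generation and an index. This is a sketch of an alternative, not the method used in the answer; it assumes every ID has the form `G<generation>I<index>`, and `edges`, `ids`, `generation`, `index` and `co2` are illustrative names.

```r
library(igraph)

d <- data.frame(
  offspring = c("G2I1", "G2I2", "G2I3", "G3I1", "G3I2", "G3I3", "G3I4", "G4I1", "G4I2", "G4I3", "G4I4", "G5I1", "G5I2", "G5I3"),
  parent1   = c("G1I1", "G1I2", "G1I1", "G2I1", "G2I3", "G2I1", "G2I3", "G3I2", "G3I2", "G3I1", "G3I4", "G4I3", "G4I3", "G4I1"),
  parent2   = c("G1I3", "G1I2", "G1I2", "G2I2", "G2I2", "G2I2", "G2I3", "G3I4", "G3I1", "G3I2", "G3I4", "G4I1", "G4I1", "G4I2"),
  stringsAsFactors = FALSE
)

# One edge per parent-offspring pair; unique() drops the duplicate edge that a clone would create
edges <- unique(data.frame(from = c(d$parent1, d$parent2), to = rep(d$offspring, 2)))
g <- graph_from_data_frame(edges)

# Parse generation and within-generation index from vertex names such as "G3I4"
ids <- V(g)$name
generation <- as.integer(sub("^G([0-9]+)I[0-9]+$", "\\1", ids))
index <- as.integer(sub("^G[0-9]+I([0-9]+)$", "\\1", ids))

# x = index within the generation, y = -generation so that generation 1 is drawn at the top
co2 <- cbind(index, -generation)
plot(g, layout = co2, edge.arrow.size = 0.25)
```

Because each (generation, index) pair is unique, no two nodes share a position, and each generation sits on its own horizontal row. Edges between distant indices may still cross, so the x positions may need manual adjustment for larger genealogies.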