Tags: r, tree, hierarchy

Count the maximum number of levels below each node of a tree in R


In R, I need to get the number of layers below each node in a tree. If my data are:

from,to
A,Z
B,Z
C,A
D,A
E,A
F,D
G,D
H,G
I,C

The results should be:

A 3
B 0
C 1
D 2
E 0
F 0
G 1
H 0
I 0
Z 4

I've been trying to do this with data.tree but can't work it out, and I'm not sure which other packages would be helpful here. Any help would be much appreciated.


Solution

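  • You can get this using the igraph package. Convert your edge list to a directed graph, then compute the distances between nodes. Since each edge points from a child to its parent, the number of levels below a node is the largest finite distance from any other node to it.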

    ## Your data
    EdgeList = as.matrix(read.table(text="from to
    A Z
    B Z
    C A
    D A
    E A
    F D
    G D
    H G
    I C",
    header=TRUE))
    
    ## convert to an igraph graph
    library(igraph)
    g = graph_from_edgelist(EdgeList)
    
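    ## Height of a node: the longest path from any descendant up to it.
    ## Edges point from child to parent, so mode="out" follows them upward.
    height = function(v) {
        D = distances(g, to=v, mode="out")
        max(D[is.finite(D)])   # drop Inf entries for nodes that cannot reach v
    }
    
    ## Apply it to all nodes
    sapply(V(g), height)
    ## A Z B C D E F G H I 
    ## 3 4 0 1 2 0 0 1 0 0 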
    

    If you really want these in alphabetical order, you can order them with

    H = sapply(V(g), height)
    H[order(names(H))]
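
    If you would rather not depend on a package, the same heights can probably be obtained with a small recursion in base R. This is only a sketch, not part of the igraph approach above: it assumes the edge list really is a tree (no cycles, each node has at most one parent), and the names `edges`, `children`, `node_height` and `heights` are made up for illustration.

    ```r
    ## Edge list from the question: each row points from a child to its parent
    edges <- read.table(text = "from to
    A Z
    B Z
    C A
    D A
    E A
    F D
    G D
    H G
    I C", header = TRUE, colClasses = "character")

    ## children[["X"]] holds the direct children of X; leaves have no entry
    children <- split(edges$from, edges$to)

    ## Height of a node: 0 for a leaf, otherwise 1 + the tallest child subtree.
    ## Assumes there are no cycles, otherwise the recursion never terminates.
    node_height <- function(v) {
        kids <- children[[v]]
        if (is.null(kids)) return(0)
        1 + max(vapply(kids, node_height, numeric(1)))
    }

    ## Evaluate every node, in alphabetical order
    nodes <- sort(unique(c(edges$from, edges$to)))
    heights <- vapply(nodes, node_height, numeric(1))
    heights
    ```

    For the data in the question this should reproduce the expected values (A 3, B 0, C 1, D 2, and so on up to Z 4). Each subtree is re-walked for every ancestor, so for very large trees the igraph version, or a memoised variant of this recursion, is likely the better choice.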