Hierarchical tree diagram in R


I am looking for an R function to build the specific type of diagram shown here. The data.tree package seems promising, but I'm stuck.

My goal (shown in the image) is simply a hierarchical diagram showing counts of values in different categories from a data frame, getting more specific as you go down. It is not a decision tree or a flow chart. What matters is that it tabulates the number of rows in each category, given a data frame and the variables I want at each level.

Here's a sample of my data:

tree_data = data.frame(Context = c("urban", "rural", "urban", "urban", "rural", "rural"),
                Lighting = c("daylight", "dark", "dark", "daylight", "daylight", "dark"),
                Driver_age = c("Senior", "Adult", "Adult", "Adult", "Adult", "Senior"))

And the desired output:

[image: the desired tree diagram]

I have gotten this far with data.tree:

tree_data$pathString = paste("crashes",
                             tree_data$Context,
                             tree_data$Lighting,
                             tree_data$Driver_age,
                             sep = "/")

crashes = as.Node(tree_data)
print(crashes)

The result is nicely organized, but I'm not sure how to add counts or how to turn it into a diagram like the one above.

            levelName
1  crashes           
2   ¦--urban         
3   ¦   ¦--daylight  
4   ¦   ¦   ¦--Senior
5   ¦   ¦   °--Adult 
6   ¦   °--dark      
7   ¦       °--Adult 
8   °--rural         
9       ¦--dark      
10      ¦   ¦--Adult 
11      ¦   °--Senior
12      °--daylight  
13          °--Adult 

Can someone advise on next steps? If there is a better package, I'm open to it. I also tried DiagrammeR and igraph, but neither seemed like something I could easily apply to different datasets. I need this to be easily repeatable.


Solution

  • This can be done easily with vtree. See the documentation: https://cran.r-project.org/web/packages/vtree/vignettes/vtree.html

    Here is an example:

    #install.packages("vtree")
    
    library(vtree)
    tree_data = data.frame(Context = c("urban", "rural", "urban", "urban", "rural", "rural"),
                           Lighting = c("daylight", "dark", "dark", "daylight", "daylight", "dark"),
                           Driver_age = c("Senior", "Adult", "Adult", "Adult", "Adult", "Senior"))
    
    vtree(tree_data, c("Context", "Lighting"))
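The call above stops at two levels. Here is a minimal sketch of extending it to all three variables from the question, wrapped in a small helper so it can be reused on other data frames. `count_tree` is a hypothetical name, not part of vtree, and `horiz = FALSE` is set on the assumption that a top-down layout is wanted:

```r
library(vtree)

tree_data <- data.frame(
  Context    = c("urban", "rural", "urban", "urban", "rural", "rural"),
  Lighting   = c("daylight", "dark", "dark", "daylight", "daylight", "dark"),
  Driver_age = c("Senior", "Adult", "Adult", "Adult", "Adult", "Senior")
)

# Hypothetical helper (not part of vtree): draws one level per variable,
# in the order given, laid out top-down instead of left-to-right.
# Extra arguments are passed straight through to vtree().
count_tree <- function(data, vars, ...) {
  vtree(data, vars, horiz = FALSE, ...)
}

count_tree(tree_data, c("Context", "Lighting", "Driver_age"))
```

Because the variables are passed as a character vector, the same helper should work on another dataset by changing only the data frame and the column names.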
    
    

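If staying with data.tree is preferred, the counts can also be attached to the tree from the question. The following is a sketch rather than a definitive approach: it tabulates the rows per full path, stores that as a leaf attribute, and sums it up to the parents. The attribute name `n_crashes` is an assumption chosen for this example:

```r
library(data.tree)

tree_data <- data.frame(
  Context    = c("urban", "rural", "urban", "urban", "rural", "rural"),
  Lighting   = c("daylight", "dark", "dark", "daylight", "daylight", "dark"),
  Driver_age = c("Senior", "Adult", "Adult", "Adult", "Adult", "Senior")
)

# One path per row, as in the question.
tree_data$pathString <- paste("crashes", tree_data$Context,
                              tree_data$Lighting, tree_data$Driver_age,
                              sep = "/")

# Count rows per unique path so that each leaf carries its own count.
leaf_counts <- as.data.frame(table(pathString = tree_data$pathString),
                             stringsAsFactors = FALSE)
names(leaf_counts)[names(leaf_counts) == "Freq"] <- "n_crashes"

crashes <- as.Node(leaf_counts)

# Sum the leaf counts up to every parent, visiting children first.
crashes$Do(
  function(node) {
    node$n_crashes <- Aggregate(node, attribute = "n_crashes", aggFun = sum)
  },
  traversal = "post-order"
)

# Show the count next to each node in the printed tree.
print(crashes, "n_crashes")

# Optional: put the count into each node label before plotting.
SetNodeStyle(crashes,
             label = function(node) paste0(node$name, "\n", node$n_crashes))
# plot(crashes)  # needs the DiagrammeR package to be installed
```

Counting per path before calling `as.Node()` matters because rows with identical paths collapse into a single leaf, so the number of leaves alone would undercount repeated combinations.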