Transform a path of nodes into a tree structure list of lists


I have a data.frame with one column representing a path of nodes, which I would like to transform into a tree. Is there a simple function to do so?

Here's a simple example:

data <- data.frame(
  Name = c("A", "A1", "A2", "A1a", "A1b", "A2a", "A2b", "A2c"),
  Path = c("1", "1,1", "1,2", "1,1,1", "1,1,2", "1,2,1", "1,2,2", "1,2,3")
)

I would like to transform it into:


nodes <- list(
  list(
    text = "A",
    li_attr = list(id = "1"),
    state = list(opened = TRUE),
    children = list(
      list(
        text = "A1",
        li_attr = list(id = "1,1"),
        state = list(opened = TRUE),
        children = list(
          list(
            text = "A1a",
            li_attr = list(id = "1,1,1")),
          list(
            text = "A1b",
            li_attr = list(id = "1,1,2"))
        )
      ),
      list(
        text = "A2",
        li_attr = list(id = "1,2"),
        state = list(opened = TRUE),
        children = list(
          list(
            text = "A2a",
            li_attr = list(id = "1,2,1")),
          list(
            text = "A2b",
            li_attr = list(id = "1,2,2")),
          list(
            text = "A2c",
            li_attr = list(id = "1,2,3"))
        )
      )
    )
  )
)

Solution

  • Package {data.tree} is helpful for working with hierarchical data structures. In your case:

    • add a pathString column to your data frame: a slash-separated string, like a directory path, in which the last character of Name is the endpoint and each preceding character is an upstream folder. Note that this relies on every level of Name being exactly one character long:
    library(data.tree)
    library(dplyr)
    
    the_treedata <- 
        data |>
        rowwise() |>
        mutate(pathString = strsplit(Name, '') |> unlist() |> paste(collapse = '/'))
    
    ## > the_treedata
    ## # A tibble: 8 x 3
    ## # Rowwise: 
    ##   Name  Path  pathString
    ##   <chr> <chr> <chr>     
    ## 1 A     1     A         
    ## 2 A1    1,1   A/1       
    ## 3 A2    1,2   A/2       
    ## 4 A1a   1,1,1 A/1/a     
    ## 5 A1b   1,1,2 A/1/b  
    
    • convert to a data tree:
    the_tree <- the_treedata |> as.Node()
    
    • traverse the tree and use Get to collect, as a list, the result of applying a custom function to each non-leaf node:
    the_list <- 
        the_tree$Get(\(node) list(text = node$Name,
                                  li_attr = list(id = node$Path),
                                  state = list(opened = TRUE),
                                  children = Map(node$children,
                                                 f = \(child) list(text = child$Name,
                                                                   state = list(opened = TRUE),
                                                                   li_attr = list(id = child$Path)
                                                                   )
                                                 )
                                  ),
                     filterFun = isNotLeaf, ## skip leaf nodes: they are already captured via the `children` element of their parent nodes
                     simplify = FALSE
                     )
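
The Get call above returns one entry per non-leaf node, each holding only its direct children, so the result is not nested all the way down. If you need exactly the nested list from the question, here is a minimal base R sketch that recurses on the Path column instead. The helper names `parent_of` and `build_nodes` are made up for this illustration (they are not part of {data.tree}), and the sketch assumes that every Path is its parent's Path plus one more comma-separated segment:

```r
data <- data.frame(
  Name = c("A", "A1", "A2", "A1a", "A1b", "A2a", "A2b", "A2c"),
  Path = c("1", "1,1", "1,2", "1,1,1", "1,1,2", "1,2,1", "1,2,2", "1,2,3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the parent path is the path without its last
# ",segment"; top-level paths (no comma) have no parent and get NA.
parent_of <- function(path) {
  ifelse(grepl(",", path, fixed = TRUE),
         sub(",[^,]*$", "", path),
         NA_character_)
}

# Hypothetical helper: build the list of nodes whose parent is `parent`,
# recursing into each node to collect its own children.
build_nodes <- function(df, parent = NA_character_) {
  parents <- parent_of(df$Path)
  idx <- if (is.na(parent)) which(is.na(parents)) else which(parents == parent)
  lapply(idx, function(i) {
    node <- list(text = df$Name[i], li_attr = list(id = df$Path[i]))
    kids <- build_nodes(df, df$Path[i])
    if (length(kids) > 0) {
      # Only nodes with children get `state` and `children`, as in the question
      node$state <- list(opened = TRUE)
      node$children <- kids
    }
    node
  })
}

nodes <- build_nodes(data)
str(nodes, max.level = 3)
```

This recomputes the parent paths on every call, which is fine for a small table like this one; for a large table you would probably want to compute them once and split the rows by parent up front.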