Search code examples
rplotlyr-plotlysunburst-diagram

How do you plot values in mupltiple columns into a sunburst chart using plotly library?


I have my data in the following form:

DF <- data.frame(Name = c("Joe", "Albert", "Marianne", "Anastasia", "John"),
                 Total = c(36, 21, 38, 30, 29),
                 Total1 = c(16, 10, 24, 18, 22),
                 Subtotal11 = c(8, 6, 14, 5, 20),
                 Subtotal12 = c(8, 4, 10, 13, 2),
                 Total2 = c(20, 11, 14, 12, 7),
                 Subtotal21 = c(8, 4, 14, 11, 2),
                 Subtotal22 = c(12, 7, 0, 1, 5))

DF
   Name        Total Total1 Subtotal11 Subtotal12 Total2 Subtotal21 Subtotal22
1  Joe         36    16     8          8          20     8          12
2  Albert      21    10     6          4          11     4          7
3  Marianne    38    24     14         10         14     14         0
4  Anastasia   30    18     5          13         12     11         1
5  John        29    22     20         2          7      2          5

How would you plot the sunburst chart using the plotly library and the data above?


Solution

  • The biggest problem here is the lack of hierarchical information in your data (which columns are root, branches or leaves?). Currently the hierarchical structure needs to be assumed by the column names and the values - as you can see I added a column parents to provide the structure as I think it should be:

                        Name value                                        parents
     1:            All Total   154                                           <NA>
     2:            Joe Total    36                                      All Total
     3:         Albert Total    21                                      All Total
     4:       Marianne Total    38                                      All Total
     5:      Anastasia Total    30                                      All Total
    ...
    32:       Joe Subtotal22    12             All Total - Joe Total - Joe Total2
    33:    Albert Subtotal22     7       All Total - Albert Total - Albert Total2
    34:  Marianne Subtotal22     0   All Total - Marianne Total - Marianne Total2
    35: Anastasia Subtotal22     1 All Total - Anastasia Total - Anastasia Total2
    36:      John Subtotal22     5           All Total - John Total - John Total2
    

    Here is what I think you are after:

    library(data.table)
    library(plotly)
    
    DF <- data.frame(Name = c("Joe", "Albert", "Marianne", "Anastasia", "John"),
                     Total = c(36, 21, 38, 30, 29),
                     Total1 = c(16, 10, 24, 18, 22),
                     Subtotal11 = c(8, 6, 14, 5, 20),
                     Subtotal12 = c(8, 4, 10, 13, 2),
                     Total2 = c(20, 11, 14, 12, 7),
                     Subtotal21 = c(8, 4, 14, 11, 2),
                     Subtotal22 = c(12, 7, 0, 1, 5))
    
    setDT(DF)
    
    DT <- melt.data.table(DF, id.vars = "Name")
    DT[, Name := paste(Name, variable)]
    
    total_sum <- sum(DT[variable %in% "Total"]$value)
    DT[, variable := NULL]
    DT <- rbindlist(list(list(Name = "All Total", value = total_sum), DT))
    
    DT[, parents := c(
      NA_character_,
      rep("All Total", 5),
      "All Total - Joe Total",
      "All Total - Albert Total",
      "All Total - Marianne Total",
      "All Total - Anastasia Total",
      "All Total - John Total",
      rep(
        c("All Total - Joe Total - Joe Total1",
          "All Total - Albert Total - Albert Total1",
          "All Total - Marianne Total - Marianne Total1",
          "All Total - Anastasia Total - Anastasia Total1",
          "All Total - John Total - John Total1"),
        2
      ),
      "All Total - Joe Total",
      "All Total - Albert Total",
      "All Total - Marianne Total",
      "All Total - Anastasia Total",
      "All Total - John Total",
      rep(
        c("All Total - Joe Total - Joe Total2",
          "All Total - Albert Total - Albert Total2",
          "All Total - Marianne Total - Marianne Total2",
          "All Total - Anastasia Total - Anastasia Total2",
          "All Total - John Total - John Total2"),
        2
      )
    )]
    
    DT[, ids := fifelse(is.na(parents), yes = Name, no = paste(parents, Name, sep =" - "))]
    
    plot_ly(data = DT, ids = ~ids, labels= ~Name, parents = ~parents, values= ~value, type='sunburst', branchvalues = 'total')
    

    Result

    However, you might want to have a look at this answer for input formats capable of properly representing hierarchical data.

    Furthermore please check the documentation here under the section Sunburst with Repeated Labels.