Tags: r, ggplot2, plotly, ggplotly

Using plotly with ggplot2 geom_boxplot with stat = "identity" results in empty canvas


I need to include an interactive box-and-whisker plot in a Shiny app for a dataset with ~46 million rows across 11 groups, and I'd like to use ggplot2 + plotly to achieve this. Because ggplot2 takes far too long to generate the plot (and plotly can't handle that much data at all), I decided to precalculate the quantiles and pass those values to ggplot2. Here is an example of the quantiles dataset and the ggplot2 code that produces the boxplot:

quantiles_hw_dt=data.frame(
 stringsAsFactors = FALSE,
   check.names = FALSE,
       dept_id = c("TFWHH9388IU","YGQGI3019WK",
                   "DKGYA0367QU","TOXLN0137AW","XLETL1793EZ","UXYFN1869CM",
                   "LLHPP0112XP","GYKJF2649DH","RKPIE1418HX",
                   "AZOMD4805RL","UZGWY7250YJ"),
          `0%` = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
         `25%` = c(8L, 5L, 13L, 7L, 8L, 7L, 6L, 11L, 12L, 9L, 10L),
         `50%` = c(12L, 7L, 20L, 10L, 11L, 9L, 8L, 18L, 19L, 14L, 16L),
         `75%` = c(17L, 9L, 29L, 14L, 16L, 12L, 10L, 25L, 28L, 21L, 23L),
        `100%` = c(63L, 27L, 96L, 48L, 57L, 42L, 34L, 88L, 91L, 76L, 71L)
                )
library(ggplot2)

# draw the boxes directly from the precomputed quantiles
p <- ggplot(quantiles_hw_dt, aes(dept_id)) +
  geom_boxplot(
    aes(ymin = `0%`, lower = `25%`, middle = `50%`, upper = `75%`, ymax = `100%`),
    stat = "identity"
  ) +
  coord_flip()
p

[Image: the ggplot2 boxplot produced by the code above]
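
For context, a table shaped like quantiles_hw_dt can be precomputed with base R along the following lines. This is only a sketch under assumptions: the raw data frame raw_dt and its value column are hypothetical stand-ins for the real 46-million-row dataset, and for data of that size a grouped approach such as data.table would probably be faster.

```r
# Hypothetical raw data: one row per observation (stand-in for the real dataset)
set.seed(42)
raw_dt <- data.frame(
  dept_id = sample(c("TFWHH9388IU", "YGQGI3019WK", "DKGYA0367QU"), 1000, replace = TRUE),
  value   = rpois(1000, lambda = 12),
  stringsAsFactors = FALSE
)

# Five-number summary per group: split the values by dept_id, then take the quantiles
probs <- c(0, 0.25, 0.5, 0.75, 1)
q_mat <- t(sapply(split(raw_dt$value, raw_dt$dept_id), quantile, probs = probs))

# One row per group, with columns named `0%`, `25%`, ..., `100%`
quantiles_sketch <- data.frame(
  dept_id = rownames(q_mat),
  q_mat,
  check.names = FALSE,
  row.names = NULL,
  stringsAsFactors = FALSE
)
quantiles_sketch
```

The resulting column names match the ones used in the aes() mapping above, so the same plotting code applies unchanged.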

However, when I try to convert it to plotly, I get a blank canvas:

library(plotly)

l <- plotly_build(p)
l$data[[1]]$orientation <- "h"
l

[Image: the resulting blank plotly canvas]

I am aware of some old issues plotly has with coord_flip(), hence the plotly_build() approach that I attempted (after ggplotly() failed as well), but it made no difference. Even removing the coord_flip() call does not solve the problem. Here's the plotly output for the same ggplot without coord_flip():

[Image: plotly output of the same plot without coord_flip()]

What am I missing here? Thanks.


Solution

  • I commented yesterday, but you asked a few weeks ago and didn't get any answers. As I stated in my comment, setting the axis range explicitly can really help reduce processing time with Plotly. If you think about it, Plotly has to process all of the data to establish the range before it can even build the base plot. You won't notice a difference in processing time with a dataset the size of this example, only when there is a significant amount of data.

    The Plot

    As with ggplot, you can specify x, y, and groups, but a Plotly box trace also accepts the precomputed statistics directly (lowerfence, q1, median, q3, and upperfence).

    Using the same quantile columns you passed to ggplot:

    library(plotly)

    # box trace drawn from the precomputed quantiles, with dept_id on the y-axis
    plot_ly(quantiles_hw_dt, type = "box", y = ~dept_id,
            lowerfence = ~`0%`, q1 = ~`25%`, median = ~`50%`,
            q3 = ~`75%`, upperfence = ~`100%`)
    

    This is the default plot with no styles:

    [Image: the default plotly box plot]

    Setting the Range

    This is done in the layout. I've extracted the unique values for the y-axis. For the x-axis, I set the range to c(0, 100), which covers every value in this example (the largest is 96).

    When I extracted the y-axis labels, I sorted them. When you assign the categories this way, the order they are in when you assign them is the order in which they appear in the plot. (They won't be alphabetized, for example, unless you sort them.)

    I also assigned padding so that the y-axis labels aren't pushed up against the plot.

    # identify the ranges for the plot
    ys <- sort(unique(quantiles_hw_dt$dept_id), decreasing = TRUE)
    xs <- c(0, 100)

    plot_ly(quantiles_hw_dt, type = "box", y = ~dept_id,
            lowerfence = ~`0%`, q1 = ~`25%`, median = ~`50%`,
            q3 = ~`75%`, upperfence = ~`100%`) %>%
      layout(xaxis = list(range = xs),
             # categoryorder = "array" makes the use of categoryarray explicit
             yaxis = list(categoryorder = "array", categoryarray = ys),
             margin = list(pad = 10))
    

    [Image: the box plot with the ranges set]
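
If the data change, the hard-coded ranges above could instead be derived from the quantile table. A minimal base R sketch, assuming a table with the same column names as quantiles_hw_dt (the three-row q below is a cut-down copy for illustration, and rounding the upper limit up to the next multiple of 10 is an arbitrary choice):

```r
# Cut-down copy of the quantile table (first three departments only)
q <- data.frame(
  dept_id = c("TFWHH9388IU", "YGQGI3019WK", "DKGYA0367QU"),
  `0%`    = c(3L, 3L, 3L),
  `100%`  = c(63L, 27L, 96L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# x-axis: start at zero and round the largest whisker up to the next multiple of 10
xs <- c(0, ceiling(max(q$`100%`) / 10) * 10)

# y-axis: categories in the order they should be handed to categoryarray
ys <- sort(unique(q$dept_id), decreasing = TRUE)

xs  # 0 100
ys
```

xs and ys can then be passed to layout() as range and categoryarray, exactly as in the call above.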

    To Make it Look like ggplot

    If you want it to look more like ggplot, you can use the information that's in the plot you attempted to create. This doesn't include all of the stylings, but it should be enough to give you an idea of how you could change the style without a whole lot of effort.

    p2 <- ggplotly(p)    # build the (empty) plot only to borrow its styles
    x <- p2$x$layout     # extract the layout

    plot_ly(quantiles_hw_dt, type = "box", y = ~dept_id,
            lowerfence = ~`0%`, q1 = ~`25%`, median = ~`50%`,
            q3 = ~`75%`, upperfence = ~`100%`) %>%
      layout(margin = x$margin, plot_bgcolor = x$plot_bgcolor,  # attach the borrowed styles
             paper_bgcolor = x$paper_bgcolor, font = x$font,
             xaxis = list(showline = x$xaxis$showline,
                          linecolor = x$xaxis$linecolor,
                          gridcolor = x$xaxis$gridcolor,
                          linewidth = x$xaxis$linewidth,
                          zeroline = x$xaxis$zeroline,
                          tickfont = x$xaxis$tickfont,
                          ticklen = x$xaxis$ticklen),
             yaxis = list(showline = x$yaxis$showline,
                          linecolor = x$yaxis$linecolor,
                          gridcolor = x$yaxis$gridcolor,
                          linewidth = x$yaxis$linewidth,
                          zeroline = x$yaxis$zeroline,
                          tickfont = x$yaxis$tickfont,
                          ticklen = x$yaxis$ticklen,
                          title = x$yaxis$title))
    
    

    [Image: the plotly box plot styled like ggplot]

    (This last plot doesn't include range setting.)
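
Since the plot is destined for a Shiny app, here is a minimal sketch of how the plot_ly() call could be wired in with plotlyOutput() and renderPlotly() from the plotly package. The output id "dept_box" and the three-row table are illustrative assumptions, not part of the original question.

```r
library(shiny)
library(plotly)

# Cut-down copy of the precomputed quantile table (illustration only)
q <- data.frame(
  dept_id = c("TFWHH9388IU", "YGQGI3019WK", "DKGYA0367QU"),
  `0%`    = c(3L, 3L, 3L),
  `25%`   = c(8L, 5L, 13L),
  `50%`   = c(12L, 7L, 20L),
  `75%`   = c(17L, 9L, 29L),
  `100%`  = c(63L, 27L, 96L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

ui <- fluidPage(
  plotlyOutput("dept_box")  # "dept_box" is a hypothetical output id
)

server <- function(input, output, session) {
  output$dept_box <- renderPlotly({
    # Box trace built from the precomputed statistics, as in the answer above
    plot_ly(q, type = "box", y = ~dept_id,
            lowerfence = ~`0%`, q1 = ~`25%`, median = ~`50%`,
            q3 = ~`75%`, upperfence = ~`100%`)
  })
}

app <- shinyApp(ui, server)
# Run interactively with: runApp(app)
```

Because only the small quantile table reaches the browser, the app never has to ship or summarise the raw rows at render time.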