
How to use multiple groups in plotly but only a defined number of legendgroups


Assume I have time series for several different cells, which I can split by whether they received a treatment or not. How can I plot every individual time series (without averaging) in plotly, but group them by treatment?

This works perfectly with ggplot, and I know I can call ggplotly on the result, but is there a pure plotly way?

Here are some dummy data:

library("dplyr")
library("plotly")

cell <- c(rep("a", 10), rep("b", 10),  rep("c", 10),  rep("d", 10),  rep("e", 10), rep("f", 10))
group <- c(rep("Untreated", 10), rep("Treated", 30), rep("Unknown", 20))
time <- rep(1:10, times=6)
value <- c(runif(60))

df <- data.frame(cell, group, time, value)



# I want this in plotly:
ggplot(df, aes(x=time, y=value, group=cell, color=group)) +
  geom_line()


# With many cells this floods the legend (my real data have hundreds of cells)
plot_ly(df, x=~time, y=~value, split=~cell, color=~group,
        type="scatter", mode="lines")


# This works but it connects the last and the first timepoint
plot_ly(df, x=~time, y=~value, group=~cell, color=~group,
        type="scatter", mode="lines")

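What I need is the ggplot result: one line per cell, colored by group, with a single legend entry per group.

With split, plotly instead gives me one legend entry per cell (far too many when there are many cells).

With group, it connects the end of one cell's series to the start of the next.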

Is there any way plotly can do it - or do I need to use ggplotly?

Edit:

  • New and extended dummy data
  • New plots

Solution

  • You can put all traces of one treatment into the same legendgroup and hide the duplicated legend entries by setting showlegend = FALSE on all but one trace per group.

    Also see: https://plotly.com/r/legend/#grouped-legend

    Edit: After @JulianStopp modified the example data, here is a generalized approach for finding the traces to hide in the legend. Sorry for switching to data.table, but I'm not familiar with dplyr:

    library(data.table)
    library(plotly)
    
    cell <- c(rep("a", 10), rep("b", 10),  rep("c", 10),  rep("d", 10),  rep("e", 10), rep("f", 10))
    group <- c(rep("Untreated", 10), rep("Treated", 30), rep("Unknown", 20))
    time <- rep(1:10, times=6)
    value <- c(runif(60))
    
    DF <- data.frame(cell, group, time, value)
    
    setDT(DF)
    setorder(DF, group, cell, time)
    # keep the legend entry only for the first cell within each group
    DF[, showlegend := rleid(cell) == 1L, by = group]
    DF[, trace_index := .GRP, by = .(group, cell)] # create trace indices
    
    # one trace per cell; all traces of a group share one legendgroup, name and color
    p <- plot_ly(DF, x=~time, y=~value, split=~cell, legendgroup = ~group, name = ~group, color = ~group,
            type="scatter", mode="lines") %>%
      style(showlegend = FALSE, traces = unique(DF[showlegend == FALSE, trace_index]))
    
    print(p)
    

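    For readers who would rather stay in dplyr, the same bookkeeping can be sketched as below. This is not part of the original answer. It assumes, like the data.table version, that plotly creates the traces in group-then-cell order, and the names `traces` and `hide` are made up for illustration.

    ```r
    library(dplyr)
    library(plotly)

    # Same dummy data as in the question, sorted by group, cell and time
    cell  <- rep(c("a", "b", "c", "d", "e", "f"), each = 10)
    group <- c(rep("Untreated", 10), rep("Treated", 30), rep("Unknown", 20))
    time  <- rep(1:10, times = 6)
    value <- runif(60)
    df <- data.frame(cell, group, time, value) %>%
      arrange(group, cell, time)

    # One row per trace, numbered in the (assumed) group-then-cell trace order
    traces <- df %>%
      distinct(group, cell) %>%
      mutate(trace_index = row_number(),
             showlegend  = !duplicated(group))

    # Hide the legend entry of every trace except the first one in each group
    hide <- traces$trace_index[!traces$showlegend]

    p <- plot_ly(df, x = ~time, y = ~value, split = ~cell,
                 legendgroup = ~group, name = ~group, color = ~group,
                 type = "scatter", mode = "lines") %>%
      style(showlegend = FALSE, traces = hide)
    ```

    Printing `p` renders the plot. With the dummy data, `hide` contains the trace indices 2, 3 and 5.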


    Initial answer:

    library("dplyr")
    library("plotly")
    
    cell <- c(rep("a", 10), rep("b", 10),  rep("c", 10))
    group <- c(rep("Untreated", 10), rep("Treated", 20))
    time <- rep(1:10, times = 3)
    value <- c(runif(30))
    
    df <- data.frame(cell, group, time, value)
    
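    # all traces of a group share one legendgroup; hide the legend entry of trace 2 (the second "Treated" cell)
    plot_ly(df, x=~time, y=~value, split=~cell, legendgroup = ~group, name = ~group, color = ~group,
            type="scatter", mode="lines") %>% style(showlegend = FALSE, traces = 2)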
    

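    As an alternative to hiding legend entries trace by trace, plotly's support for dplyr groups may already be enough. The following is a minimal sketch, not taken from the answer above, and it assumes a plotly 4.x release in which `add_lines()` respects `group_by()`. Grouping by `cell` should make plotly break the line between cells, while `color = ~group` should produce one trace, and therefore one legend entry, per treatment.

    ```r
    library(dplyr)
    library(plotly)

    # Same dummy data as in the question
    cell  <- rep(c("a", "b", "c", "d", "e", "f"), each = 10)
    group <- c(rep("Untreated", 10), rep("Treated", 30), rep("Unknown", 20))
    time  <- rep(1:10, times = 6)
    value <- runif(60)
    df <- data.frame(cell, group, time, value)

    p <- df %>%
      group_by(cell) %>%                                  # one line per cell
      plot_ly(x = ~time, y = ~value, color = ~group) %>%  # one color per group
      add_lines()
    ```

    The trade-off of this variant is that all cells of a treatment share a single trace, so they can only be toggled together in the legend and cannot be styled individually.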