
R plotly simple boxplot highlighting the most recent value


This is probably a simple question, but I can't find a way to achieve what I need. I have a simple boxplot, built as follows:

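library(plotly)

end_dt <- as.Date("2021-02-12")
start_dt <- end_dt - (nrow(iris) - 1)
dates <- seq.Date(start_dt, end_dt, by = "1 day")  # one date per row; not used in the plot below
df <- iris
df$LAST_VAL <- "N"
df[3, "LAST_VAL"] <- "Y"  # flag the row treated as the most recent value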

df1 <- df[,c("Sepal.Length","LAST_VAL")]
df1$DES <- 'Sepal.Length'
colnames(df1) <- c("VALUES","LAST_VAL","DES")

df2 <- df[,c("Sepal.Width","LAST_VAL")]
df2$DES <- 'Sepal.Width'
colnames(df2) <- c("VALUES","LAST_VAL","DES")

df <- rbind(df1, df2)

fig <- plot_ly(df, y = ~VALUES, color = ~DES, type = "box") %>% layout(showlegend = FALSE)  

What I would like to do now is add a red marker to each boxplot, only for the value where LAST_VAL = "Y". This would let me see where the most recent value sits within the distribution shown by each box. I tried to use the info on https://plotly.com/r/box-plots/ but I can't figure out how to do this. Thanks


Solution

  • The following solution ended up a bit long code-wise, but it should give you what you asked for. The idea is to plot the highlighted points first as a scatter trace and then add the boxplots afterwards, like:

    # Start with the LAST_VAL == "Y" rows as red markers, one per variable
    fig <- plot_ly(df[df$LAST_VAL == "Y", ],
                   x = ~DES, y = ~VALUES, color = ~DES,
                   type = "scatter", mode = "markers", colors = 'red') %>%
      layout(showlegend = FALSE) %>%
      # Then add one boxplot per variable, without individual points
      add_boxplot(data = df[df$DES == "Sepal.Length", ], x = ~DES, y = ~VALUES,
                  showlegend = FALSE, color = ~DES,
                  boxpoints = FALSE, fillcolor = 'white', line = list(color = 'blue')) %>%
      add_boxplot(data = df[df$DES == "Sepal.Width", ], x = ~DES, y = ~VALUES,
                  showlegend = FALSE, color = ~DES,
                  boxpoints = FALSE, fillcolor = 'white', line = list(color = 'green'))
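
A shorter variant might be to draw the boxplots first and overlay the flagged rows with add_markers(). The following is only a sketch under a few assumptions: the names `long` and `last_pts` are made up for this example, the marker size is arbitrary, and `inherit = FALSE` is used so that the marker trace does not pick up the `color = ~DES` mapping from plot_ly().

```r
library(plotly)

# Rebuild the long-format data from the question (base R only)
df <- iris
df$LAST_VAL <- "N"
df[3, "LAST_VAL"] <- "Y"  # the question flags row 3 as the most recent value

# `long` and `last_pts` are hypothetical names used only in this sketch
long <- rbind(
  data.frame(VALUES = df$Sepal.Length, LAST_VAL = df$LAST_VAL, DES = "Sepal.Length"),
  data.frame(VALUES = df$Sepal.Width,  LAST_VAL = df$LAST_VAL, DES = "Sepal.Width")
)

# Rows to highlight: one per box
last_pts <- long[long$LAST_VAL == "Y", ]

# Boxplots first, then a red marker on top of each box
fig <- plot_ly(long, x = ~DES, y = ~VALUES, color = ~DES, type = "box") %>%
  add_markers(
    data = last_pts, x = ~DES, y = ~VALUES,
    marker = list(color = "red", size = 10),
    inherit = FALSE
  ) %>%
  layout(showlegend = FALSE)

fig
```

Because the markers come from a filtered data frame, flagging a different row as LAST_VAL = "Y" should move the red points without any change to the plotting code.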
    
