Plotting multiple categorical variables in R Plotly


I am trying to make a graph comparing max, min, and mean temperatures from two locations (turnbull and finley) with the plotly package in R. I have been able to build a scatter plot for each location independently, but cannot figure out how to plot both sites on the same graph. Here is a link to the data set (referenced as temp_c in the code): https://docs.google.com/spreadsheets/d/1A1HkOVjifYRp62fkMO2Xe8_STzo2rfq4UXfso9kjxfw/edit#gid=0

Here is my code for one of the locations - I would like to plot both locations on one graph:

fig_fin_1 <- plot_ly(temp_c[temp_c$location=="finley",], x = ~date, y = ~max_temp_c, 
                   type = 'scatter', mode = 'lines', 
                   line = list(color = 'transparent'),
                   showlegend = FALSE, name = 'Finley Max') 
fig_fin_1 <- fig_fin_1 %>% add_trace(y = ~min_temp_c, split = c("finley"), type = 'scatter', mode = 'lines', 
                                 fill = 'tonexty', fillcolor='rgba(0,100,80,0.2)', 
                                 line = list(color = 'transparent'),
                                 showlegend = FALSE, 
                                 name = 'Finley Min') 
fig_fin_1 <- fig_fin_1 %>% add_trace(x = ~date, y = ~ave_temp_c, split = c("finley"), type = 'scatter', mode = 'lines',
                                 line = list(color='green'),
                                 name = 'Finley Mean') 
fig_fin_1 <- fig_fin_1 %>% layout(title = "Historical Average, Min and Max Temperatures for Finley NWR",
                              paper_bgcolor='rgb(255,255,255)', plot_bgcolor='rgb(229,229,229)',
                              xaxis = list(title = "Date",
                                           gridcolor = 'rgb(255,255,255)',
                                           showgrid = TRUE,
                                           showline = FALSE,
                                           showticklabels = TRUE,
                                           tickcolor = 'rgb(127,127,127)',
                                           ticks = 'outside',
                                           zeroline = FALSE),
                              yaxis = list(title = "Temperature (degrees C)",
                                           gridcolor = 'rgb(255,255,255)',
                                           showgrid = TRUE,
                                           showline = FALSE,
                                           showticklabels = TRUE,
                                           tickcolor = 'rgb(127,127,127)',
                                           ticks = 'outside',
                                           zeroline = FALSE))

fig_fin_1

I've tried adding all the traces from each graph into one graph, but I'm not sure how to retain the two unique colors representing the locations when I do this, and I get a strange third line. I've also tried making a new data frame for each location and temperature measurement (e.g. turnbull_min_c), but that didn't work either.

Any help would be much appreciated!


Solution

  • Update

    I misunderstood the question: you want to combine the graphs into one, not recreate the graph in your picture. I tried several ways to make this happen and found only one method that works every time.

    The data I originally created for the two locations were so similar that the traces were indistinguishable on the same graph, so I modified that data (df1, described in the original answer).

    # shift the first location down by 5 degrees so the overlap is easier to see
    df1$min_temp <- ifelse(df1$location == unique(df1$location)[1],
                           df1$min_temp - 5, df1$min_temp)
    df1$max_temp <- ifelse(df1$location == unique(df1$location)[1],
                           df1$max_temp - 5, df1$max_temp)
    df1$ave_temp <- ifelse(df1$location == unique(df1$location)[1],
                           df1$ave_temp - 5, df1$ave_temp)
    

    I used the same map call as in the original answer to create plt1 and plt2. (You could also drop the layout() step from that call and add the layout once at the end.)

    I should point out that I reduced the opacity for the blue fill from .2 to .1 so that you could see it overlapping the green (otherwise, it was hard to tell that's what happened).

    Then I extracted the trace data from each of these plots so that I could make all of the fill traces first.

    plt1 <- plotly_build(plt1)
    plt2 <- plotly_build(plt2)
    plt1_d <- plt1$x$data # extract all trace data
    plt2_d <- plt2$x$data
    # restack trace data so all fill traces are first
    ndata <- list(plt1_d[[1]], plt1_d[[2]], plt2_d[[1]], plt2_d[[2]],
                  plt1_d[[3]], plt2_d[[3]])
    

    Now that ndata is the new lineup of traces, I replaced the data in plt1.

    plt1$x$data <- ndata
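
    The order matters because fill = "tonexty" fills down to whichever trace comes immediately before it. As a rough check (a sketch only, not part of the steps above), you can list the fill attribute of each restacked trace. The data frame toy and the helper make_band are hypothetical stand-ins for df1 and the map call, and the colors are just examples.

    ```r
    library(plotly)

    # Hypothetical stand-in for df1: two locations, three dates each
    toy <- data.frame(
      dat = rep(as.Date("2024-01-01") + 0:2, times = 2),
      location = rep(c("A", "B"), each = 3),
      min_temp = c(1, 2, 3, 6, 7, 8),
      max_temp = c(5, 6, 7, 10, 11, 12),
      stringsAsFactors = FALSE
    )
    toy$ave_temp <- (toy$min_temp + toy$max_temp) / 2

    # Hypothetical helper: builds min line, filled max line, and mean line,
    # then returns the built trace list (what plt1$x$data holds above)
    make_band <- function(d, fill_col, line_col) {
      p <- plot_ly(d, x = ~dat, y = ~min_temp, type = "scatter", mode = "lines",
                   line = list(color = "transparent"), showlegend = FALSE) %>%
        add_trace(y = ~max_temp, type = "scatter", mode = "lines",
                  fill = "tonexty", fillcolor = fill_col,
                  line = list(color = "transparent")) %>%
        add_trace(y = ~ave_temp, type = "scatter", mode = "lines",
                  line = list(color = line_col))
      plotly_build(p)$x$data
    }

    tr_a <- make_band(toy[toy$location == "A", ],
                      "rgba(0, 100, 80, .2)", "rgba(0, 100, 80, 1)")
    tr_b <- make_band(toy[toy$location == "B", ],
                      "rgba(0, 0, 255, .1)", "rgba(0, 0, 255, 1)")

    # Same lineup as ndata: both min/max pairs first, then the two mean lines
    restacked <- list(tr_a[[1]], tr_a[[2]], tr_b[[1]], tr_b[[2]],
                      tr_a[[3]], tr_b[[3]])

    # Each "tonexty" trace should sit directly after its own min trace.
    # [["fill"]] is used because $fill would partially match fillcolor.
    fills <- vapply(restacked, function(tr) {
      if (is.null(tr[["fill"]])) "none" else as.character(tr[["fill"]])
    }, character(1))
    print(fills)
    ```

    If a filled trace ends up right after a mean line or after the other location's max trace, the shaded band will be drawn against the wrong line.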
    

    If you kept the layout in the map call (or in your original plot traces), you won't need to call layout again. If you removed it, you can add it now.

    plt1 %>% 
      # your original layout (I've changed nothing here)
      layout(title = "Historical Average, Min and Max Temperatures for Finley NWR",
             paper_bgcolor='rgb(255,255,255)', plot_bgcolor='rgb(229,229,229)',
             xaxis = list(title = "Date", gridcolor = 'rgb(255,255,255)',
                          showgrid = TRUE, showline = FALSE, showticklabels = TRUE,
                          tickcolor = 'rgb(127,127,127)', ticks = 'outside',
                          zeroline = FALSE),
             yaxis = list(title = "Temperature (degrees C)",
                          gridcolor = 'rgb(255,255,255)',
                          showgrid = TRUE, showline = FALSE, showticklabels = TRUE, 
                          tickcolor = 'rgb(127,127,127)', ticks = 'outside',
                          zeroline = FALSE))
    


    Original answer

    I'm assuming from what little I know about your data that the field location has two unique values, and that's what you're splitting the graphs with.

    You didn't include data, so I've included the data I used in this answer.

    library(tidyverse)
    library(lubridate) # for today(); attached by library(tidyverse) only from tidyverse 2.0.0 on
    library(plotly)
    
    lows <- c(seq(-10, 20, length.out = 7), seq(15, -10, length.out = 5))
    highs <- c(seq(0, 30, length.out = 7), seq(25, 0, length.out = 5))
    set.seed(25)
    df1 <- map2(1:12, rep(c(31, 30), 6),
                   function(j, k) {
                     max_temp <- rnorm(k * 2, highs[j], 1)
                     min_temp <- rnorm(k * 2, lows[j], 1)
                     # row 1 = max, row 2 = min, so each column mean is that observation's midpoint
                     m <- matrix(c(max_temp, min_temp), nrow = 2, byrow = TRUE)
                     ave_temp <- colMeans(m)
                     data.frame(ave_temp = ave_temp, max_temp = max_temp, 
                                min_temp = min_temp)
                   }) %>% bind_rows() %>% 
      mutate(dat = rep(seq.Date(from = today(), by = 1, length.out = 366), each = 2),
             location = rep(c("A", "B"), 366)) %>% 
      select(dat, location, everything())
    head(df1)
    

    I used map, but lapply would work the same way. This creates two separate plots and saves them in the global environment with assign(): one object is plt1; the other is plt2.

    map(1:2,
        function(i) {
          df <- filter(df1, location == unique(df1$location)[i]) # filter for subplot
          # the area first
          p <- plot_ly(df, x = ~dat, y = ~min_temp, type = "scatter", mode = "lines",
                       showlegend = FALSE, line = list(color = "transparent")) %>%
            # max trace, filled down to the min trace (data is inherited from plot_ly)
            add_trace(type = "scatter", mode = "lines", x = ~dat, 
                      y = ~max_temp, fill = "tonexty", 
                      fillcolor = c('rgba(0, 100, 80, .2)', 'rgba(0, 0, 255, .2)')[i],
                      opacity = .2, line = list(color = "transparent")) %>%
            # line only 
            add_trace(type = "scatter", mode = "lines", x = ~dat, y = ~ave_temp, 
                      line = list(color = c('rgba(0, 100, 80, 1)', 
                                            'rgba(0, 0, 255, 1)')[i])) %>% 
    
            # your original layout (I've changed nothing here)
            layout(title = "Historical Average, Min and Max Temperatures for Finley NWR",
                   paper_bgcolor='rgb(255,255,255)', plot_bgcolor='rgb(229,229,229)',
                   xaxis = list(title = "Date", gridcolor = 'rgb(255,255,255)',
                                showgrid = TRUE, showline = FALSE, showticklabels = TRUE,
                                tickcolor = 'rgb(127,127,127)', ticks = 'outside',
                                zeroline = FALSE),
                   yaxis = list(title = "Temperature (degrees C)",
                                gridcolor = 'rgb(255,255,255)',
                                showgrid = TRUE, showline = FALSE, showticklabels = TRUE, 
                                tickcolor = 'rgb(127,127,127)', ticks = 'outside',
                                zeroline = FALSE))
          assign(paste0("plt", i), p, envir = .GlobalEnv)
        })
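
    If you'd rather not write into the global environment with assign(), an lapply variant can return the plots as a list instead. This is only a sketch on a tiny hypothetical data frame (toy) that draws just the mean line, not the full plot above; the names plots, plt_a, and plt_b are made up for the example.

    ```r
    library(plotly)

    # Hypothetical stand-in for df1: two locations, three dates each
    toy <- data.frame(
      dat = rep(as.Date("2024-01-01") + 0:2, times = 2),
      location = rep(c("A", "B"), each = 3),
      ave_temp = c(3, 4, 5, 8, 9, 10),
      stringsAsFactors = FALSE
    )

    # lapply returns one plot per location, so no assign() is needed
    plots <- lapply(unique(toy$location), function(loc) {
      d <- toy[toy$location == loc, ] # filter for this location
      plot_ly(d, x = ~dat, y = ~ave_temp, type = "scatter", mode = "lines",
              name = loc)
    })
    names(plots) <- unique(toy$location)

    # pull the individual plots back out by name
    plt_a <- plots[["A"]]
    plt_b <- plots[["B"]]
    ```

    The list form also makes it easy to hand everything to subplot() at once, since subplot() accepts a list of plots.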
    

    Now all you have to do is stack them.

    subplot(plt1, plt2, nrows = 2) 
    

    The image of your desired outcome doesn't have x- and y-axis labels, but you designated them in your layout. The subplot output has no axis labels either. There is a way to add them via Plotly, but it's much easier to do with the htmltools library.
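
    For reference, here is a minimal sketch of the Plotly route. subplot() drops axis titles by default, and its titleX and titleY arguments keep them. The two small plots p1 and p2 are hypothetical stand-ins for plt1 and plt2.

    ```r
    library(plotly)

    # Hypothetical stand-ins for plt1 and plt2, each with its own axis titles
    p1 <- plot_ly(x = 1:3, y = c(2, 4, 3), type = "scatter", mode = "lines") %>%
      layout(xaxis = list(title = "Date"),
             yaxis = list(title = "Temperature (degrees C)"))
    p2 <- plot_ly(x = 1:3, y = c(5, 3, 6), type = "scatter", mode = "lines") %>%
      layout(xaxis = list(title = "Date"),
             yaxis = list(title = "Temperature (degrees C)"))

    # titleX/titleY = TRUE retain the axis titles that subplot() would otherwise drop;
    # shareX = TRUE lets the two stacked panels share one x-axis
    stacked <- subplot(p1, p2, nrows = 2, shareX = TRUE,
                       titleX = TRUE, titleY = TRUE)
    # print `stacked` in an interactive session to view the result
    ```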

    With the htmltools method, you may not want each graph to have its own title.

    plt2 <- plt2 %>% layout(title = "")
    

    Additionally, the default height via Plotly is 400 px, while the default width is 100%. Because the height is fixed, you lose the dynamic resizing. So first, let's change that.

    plt1$sizingPolicy$defaultHeight <- '100%'
    plt2$sizingPolicy$defaultHeight <- '100%'
    

    Now using htmltools, stack them.

    library(htmltools) # provides browsable() and div()

    browsable(
      div(div(plt1, style = "width: 100%; height: 49%; float: left;"), 
          div(plt2, style = "width: 100%; height: 49%; float: left;"),
          style = "width: 100vw; height: 100vh;"))
    