
R Plotly - Plotting Multiple Regression Lines


How do I add multiple regression lines to the same plot in plotly?

I want to draw the scatter plot along with a separate regression line for each CATEGORY.

The scatter plot renders fine, but the regression lines are not drawn correctly (compare the Excel output with the R output below).

library(plotly)

df <- as.data.frame(1:19)

df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)

df[,1] <- NULL

fv <- df %>%
  filter(!is.na(x)) %>%
  lm(x ~ y + y*CATEGORY,.) %>%
  fitted.values()

p <- plot_ly(data = df,
         x = ~x,
         y = ~y,
         color = ~CATEGORY,
         type = "scatter",
         mode = "markers"
) %>%
  add_trace(x = ~y, y = ~fv, mode = "lines")

p
  • Apologies for not including all the information up front, and thanks for suggesting "y*CATEGORY" to fix the parallel-line issue.

Excel Output https://i.sstatic.net/WYSfC.png

R Output https://i.sstatic.net/SCIJb.png


Solution

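  • The model in the question regresses x on y and then plots the fitted values against y. Instead, regress y on x with an interaction term ("y ~ x*CATEGORY"), store the fitted values in the data frame, and plot them against x: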

    library(plotly)
    df <-  as.data.frame(1:19)
    
    df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
    df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
    df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
    
    df[,1] <- NULL
    
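    # Fit y on x with a separate intercept and slope per CATEGORY,
    # and keep the fitted values alongside the data
    df$fv <- df %>%
      filter(!is.na(x)) %>%
      lm(y ~ x*CATEGORY,.) %>%
      fitted.values()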
    
    p <- plot_ly(data = df,
             x = ~x,
             y = ~y,
             color = ~CATEGORY,
             type = "scatter",
             mode = "markers"
    ) %>%
      # color = ~CATEGORY is inherited from plot_ly(), so one line is drawn per category
      add_trace(x = ~x, y = ~fv, mode = "lines")
    
    p
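
To compare against Excel, assuming the Excel chart uses one linear trendline per category series, it may help to check that the interaction model above gives the same fitted lines as a separate simple regression for each CATEGORY. A minimal base-R sketch on the same data (the names fit_all, fv_all, fv_sep and coefs are illustrative and not part of the original post):

```r
# Same data as in the question, built directly as a data frame
df <- data.frame(
  CATEGORY = c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B"),
  x = c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196),
  y = c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
)

# One model with a category-specific intercept and slope
fit_all <- lm(y ~ x * CATEGORY, data = df)
fv_all <- unname(fitted(fit_all))

# Separate simple regressions, one per category
fv_sep <- numeric(nrow(df))
for (k in unique(df$CATEGORY)) {
  rows <- df$CATEGORY == k
  fv_sep[rows] <- fitted(lm(y ~ x, data = df[rows, ]))
}

# Both approaches should give the same fitted values (up to rounding)
print(all.equal(fv_all, fv_sep))

# Intercept and slope per category, for comparison with the Excel trendline equations
coefs <- sapply(split(df, df$CATEGORY), function(d) coef(lm(y ~ x, data = d)))
print(coefs)
```

If the slopes and intercepts printed here differ from the Excel trendline equations, the difference is more likely in how the series were set up in Excel than in the plotly code.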
    
