r ggplot2 data-fitting

Non-linear fits in facet_grid function in ggplot2 R


For column X, every panel should be fit with a linear regression, whereas for column Y the panels should get a mix of linear and quadratic fits. However, in the plot all fits are quadratic, and some panels have no fit line at all. Any suggestions for tweaking the code?

library(ggplot2)
library(ggpmisc)  # provides stat_poly_eq()

set.seed(14)
df <- data.frame(
  var.test = rep(c("X","Y"),each =  40),
  var.test2 = rep(c("A","B","C","D"), times = 20),
  val.test = rnorm(80,4,5),
  x = c(1:80)
)

my.formula <- c(y~x,y~x,y~x,y~x,y~x + I(x^2), y~x + I(x^2),y~x + I(x^2), y ~x)

ggplot(df, aes(x = x, y = val.test)) + 
  geom_point() +
  mapply(function(x, z) {
    data <- subset(df, var.test == z)
    list(
      geom_smooth(
        method = "glm", data = data, formula = x,
        method.args = list(family = "poisson"), color = "black"
      ),
      stat_poly_eq(formula = x, aes(label = ..eq.label..), 
                   parse = TRUE, size = 2.5, col = "black", data = data, vjust = -0.1),
      stat_poly_eq(formula = x, aes(label = ..rr.label..), 
                   parse = TRUE, size = 2.5, col = "black", data = data, vjust = 1.1)
    )
  }, my.formula, c("A", "B", "C","D","X","Y")) + 
  facet_grid(var.test2~var.test)
     



Solution

  • There are two problems as far as I can see. First, you are fitting a Poisson regression, but your data are not counts: they include negative values, so the fit sometimes fails altogether, and the model is inappropriate even for subsets with no negative values. I appreciate this is just sample data, so this may not apply to your real use case.
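
    To make the Poisson point concrete, here is a minimal base R sketch. It assumes the same seed and rnorm() call as the question; the names n_negative, poisson_msg and gaussian_fit are only illustrative. It counts the negative values, captures the error a Poisson GLM raises on them, and shows that a Gaussian GLM accepts the same data.

    ```r
    # Same simulated response as in the question: normal draws, mean 4, sd 5.
    set.seed(14)
    val.test <- rnorm(80, 4, 5)
    x <- 1:80

    # Normal draws with this mean and sd regularly fall below zero.
    n_negative <- sum(val.test < 0)

    # A Poisson GLM rejects a negative response; capture the error message
    # instead of letting it stop the script.
    poisson_msg <- tryCatch(
      {
        glm(val.test ~ x, family = "poisson")
        "fit succeeded"
      },
      error = function(e) conditionMessage(e)
    )

    # A Gaussian GLM (the same model as lm() here) accepts the data.
    gaussian_fit <- glm(val.test ~ x, family = "gaussian")

    print(n_negative)
    print(poisson_msg)
    print(coef(gaussian_fit))
    ```

    Inside geom_smooth() a failed fit generally means the smoother is not drawn for that group, which would fit the missing lines you describe.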

    Second, and more importantly for your real use case, your mapping function is not quite right. You need to subset by both faceting variables so that you get 8 separate sets of layers, one for each combination of var.test and var.test2.

    ggplot(df, aes(x = x, y = val.test)) +
      geom_point() +
      # One set of layers per panel: x is the formula, z and v identify the panel
      mapply(function(x, z, v) {
        # Keep only the rows that belong to this panel
        data <- subset(df, var.test2 == z & var.test == v)
        list(
          geom_smooth(
            method = "glm", data = data, formula = x,
            method.args = list(family = "gaussian"), color = "black"
          ),
          # Fitted equation label
          stat_poly_eq(formula = x, aes(label = ..eq.label..),
                       parse = TRUE, size = 2.5, col = "black",
                       data = data, vjust = -0.1),
          # R-squared label
          stat_poly_eq(formula = x, aes(label = ..rr.label..),
                       parse = TRUE, size = 2.5, col = "black",
                       data = data, vjust = 1.1)
        )
      },
      x = my.formula,
      z = rep(c("A", "B", "C", "D"), 2),
      v = rep(c("X", "Y"), each = 4)) +
      facet_grid(var.test2 ~ var.test)

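
    To check which rows each set of layers receives, here is a small base R sketch. It assumes the df and my.formula from the question; panels, rows_old and the n column are hypothetical names introduced only for this illustration. It compares the original single-variable subsetting with subsetting on both faceting variables.

    ```r
    set.seed(14)
    df <- data.frame(
      var.test  = rep(c("X", "Y"), each = 40),
      var.test2 = rep(c("A", "B", "C", "D"), times = 20),
      val.test  = rnorm(80, 4, 5),
      x         = 1:80
    )

    # Formulas in panel order: X-A, X-B, X-C, X-D, Y-A, Y-B, Y-C, Y-D.
    my.formula <- c(y ~ x, y ~ x, y ~ x, y ~ x,
                    y ~ x + I(x^2), y ~ x + I(x^2), y ~ x + I(x^2), y ~ x)

    # Original approach: every label is compared against var.test only,
    # so "A" to "D" match no rows at all.
    rows_old <- sapply(c("A", "B", "C", "D", "X", "Y"),
                       function(z) sum(df$var.test == z))
    print(rows_old)  # A to D: 0 rows each; X and Y: 40 rows each

    # Corrected approach: one row per panel, matched on both variables.
    panels <- data.frame(
      v = rep(c("X", "Y"), each = 4),
      z = rep(c("A", "B", "C", "D"), times = 2)
    )
    panels$formula <- vapply(my.formula,
                             function(f) paste(deparse(f), collapse = ""),
                             character(1))
    panels$n <- mapply(
      function(z, v) sum(df$var.test2 == z & df$var.test == v),
      panels$z, panels$v,
      USE.NAMES = FALSE
    )
    print(panels)  # 8 panels with 10 rows each
    ```

    Each of the 8 panels gets exactly 10 rows and its own formula. In the original pairing, four of the subsets were empty and the other two each spanned a whole column of panels.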
