Tags: r, ggplot2, aesthetics

ggplot2 geom_smooth dropped aesthetics


Suppose the data are as follows: 2 continuous variables (variable_1 and variable_2), a group factor with 4 levels and a treatment factor with 2 levels.

df <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 
4L), class = "factor", levels = c("Group 1", "Group 2", "Group 3", 
"Group 4")), treatment = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 
1L, 2L), levels = c("Treatment A", "Treatment B"), class = "factor"), 
    variable_1 = c(12.8645664574823, 8.16686874306239, 11.4388230805636, 
    10.3558284665561, 7.69223760253185, 11.6723097558928, 11.6388865156122, 
    12.120960086868, 10.8183228744452, 9.1265042092642, 8.27703056872646, 
    7.31717661690565, 8.96649597992123, 9.38671782428591, 10.8796944597753, 
    12.01665144743, 9.36150447965813, 11.526731214996, 8.22753400261583, 
    8.73540671434093, 12.8645664574823, 8.16686874306239, 11.4388230805636, 
    10.3558284665561, 7.69223760253185, 11.6723097558928, 11.6388865156122, 
    12.120960086868, 10.8183228744452, 9.1265042092642, 8.27703056872646, 
    7.31717661690565, 8.96649597992123, 9.38671782428591, 10.8796944597753, 
    12.01665144743, 9.36150447965813, 11.526731214996, 8.22753400261583, 
    8.73540671434093), variable_2 = c(5.05892658390965, 5.05162026955658, 
    6.66406962970376, 5.00276432258104, 4.10367453808581, 7.45682970015026, 
    7.71323058452055, 4.49113941584846, 3.74111194890247, 3.35332563644038, 
    5.20413950314659, 5.57314098291887, 4.69174992877218, 5.20483325428839, 
    5.71013096825912, 4.97822467530198, 3.07788892862307, 7.03413602373209, 
    5.84712142228627, 5.76742125148379, 5.05892658390965, 5.05162026955658, 
    6.66406962970376, 5.00276432258104, 4.10367453808581, 7.45682970015026, 
    7.71323058452055, 4.49113941584846, 3.74111194890247, 3.35332563644038, 
    5.20413950314659, 5.57314098291887, 4.69174992877218, 5.20483325428839, 
    5.71013096825912, 4.97822467530198, 3.07788892862307, 7.03413602373209, 
    5.84712142228627, 5.76742125148379)), class = "data.frame", row.names = c(NA, 
-40L))

I want a scatterplot with a linear-model (lm) line fitted for each group level and coloured according to the treatment level. That is, I want 4 lines in 2 colours.

I use the following code:

library(ggplot2)

ggplot(df, aes(x = variable_1, y = variable_2)) +
  geom_point(aes(color = treatment)) +
  geom_smooth(aes(group = group, color = treatment), method = "lm", se = FALSE)

I get a warning:

geom_smooth() using formula = 'y ~ x'.

The following aesthetics were dropped during statistical transformation: colour

ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.

ℹ Did you forget to specify a group aesthetic or to convert a numerical variable into a factor?

In the resulting plot, all the lines are the same colour, as if no colour aesthetic had been specified for them.

Question: how can I get the desired graph, with separate lines for the groups but with the colour set by treatment?

Note that the message about dropped aesthetics is new: with an earlier version of ggplot2 (I'm not sure which one; the code comes from an old project) it did not appear, and the graph came out as I wanted, with 4 lines and 2 colours. I currently have ggplot2 3.4.0 installed.


Solution

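  • If I understand you correctly, you want eight lines in total, since both treatments occur in each of the 4 groups. In that case, you can map the group aesthetic to the interaction of group and treatment, so that treatment (and hence colour) is constant within each fitted line. With group = group alone it is not, which is why the colour aesthetic is dropped: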

    ggplot(df, aes(x = variable_1, y = variable_2)) +
      geom_point(aes(color = treatment)) +
      geom_smooth(aes(group = interaction(group, treatment),
                      color = treatment), method = "lm", se = FALSE)
    


    This can be made a bit more obvious by labelling the group of each line:

    library(geomtextpath)
    
    ggplot(df, aes(x = variable_1, y = variable_2)) +
      geom_point(aes(color = treatment)) +
      geom_textsmooth(aes(group = interaction(group, treatment),
                          color = treatment, label = group),
                      method = "lm", se = FALSE, vjust = -0.2)
    

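
As a quick check of why eight lines are expected, the following base R sketch rebuilds only the design columns of df (the group and treatment factors, in the same row order as in the question) and counts the treatment levels per group and the group-by-treatment combinations. The names design, levels_per_group and n_lines are illustrative and not part of the original code.

```r
# Rebuild only the design columns of df from the question
# (same factor levels and row order; the numeric columns are not needed here).
design <- data.frame(
  group = factor(rep(rep(paste("Group", 1:4), each = 5), times = 2)),
  treatment = factor(rep(c("Treatment A", "Treatment B"), times = 20))
)

# Number of distinct treatment levels inside each group.
levels_per_group <- tapply(
  design$treatment, design$group,
  function(x) length(unique(x))
)
print(levels_per_group)

# Number of group-by-treatment combinations actually present,
# i.e. the number of lines drawn with group = interaction(group, treatment).
n_lines <- nlevels(interaction(design$group, design$treatment, drop = TRUE))
print(n_lines)
```

Every group contains both treatments, so treatment is not constant within a group, and there are eight group-by-treatment combinations: eight fitted lines, four in each colour.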