r, ggplot2, plot, mean, scatter-plot

How to fit multiple average (horizontal) lines in ggplot in R


I want to fit multiple horizontal lines to a plot, where each line represents the mean of a different category in my data.

Let's say I have the following data frame:

product <- c("A","A","B","B", "A","A", "B","B", "C","C", "D", "D", "C","C", "D", "D")
measurement <- c(120, 122, 42, 44, 119, 118, 45, 43, 280, 281, 502, 501, 279,278, 503, 504)
sample_data <- data.frame(product, measurement)

I would like to produce a result like the following:

library(ggplot2)

ggplot(sample_data, aes(x=seq(length(sample_data$measurement)), y=measurement, colour= product)) +
  geom_point() +
  labs(x = "Data Points") +
  geom_smooth(aes(group= product), formula = y~1, method="lm", col="blue", se=TRUE, size=.005)


I have two questions:

  1. How can I make sure that each line represents the mean of each product?
  2. How can I show the value of each mean line over the line or somewhere in the legend?

Any help would be much appreciated.


Solution

  • If you want to do it all "inside" ggplot, you could do:

    library(ggplot2)
    library(geomtextpath)
    
    # Add an explicit index column to use as the x axis
    ggplot(within(sample_data, `Data Points` <- seq(nrow(sample_data))),
           aes(x = `Data Points`, y = measurement, colour = product)) +
      geom_point() +
      # One labelled segment per product, drawn at the group mean and
      # spanning that product's first to last data point
      geom_textsegment(aes(y = ave(measurement, product),
                           x = ave(`Data Points`, product, FUN = min),
                           yend = ave(measurement, product),
                           xend = ave(`Data Points`, product, FUN = max),
                           label = after_stat(y)),
                       vjust = -0.2, textcolour = "black", linetype = 2)
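
    To see which values the `ave()` calls feed into the segment layer, the per-product means can be computed directly in base R. This is only a sketch for checking the numbers: `row_means`, `product_means` and `fit_a` are names introduced here for illustration, and the data frame is rebuilt from the question so the snippet runs on its own. It also shows that an intercept-only `lm` (the `y ~ 1` formula used in the question) estimates the same value as the group mean.

    ```r
    # Data from the question
    product <- c("A","A","B","B","A","A","B","B","C","C","D","D","C","C","D","D")
    measurement <- c(120, 122, 42, 44, 119, 118, 45, 43,
                     280, 281, 502, 501, 279, 278, 503, 504)
    sample_data <- data.frame(product, measurement)

    # ave() returns one value per row: the mean of that row's product group
    row_means <- ave(sample_data$measurement, sample_data$product)

    # One mean per product
    product_means <- tapply(sample_data$measurement, sample_data$product, mean)
    print(product_means)

    # An intercept-only model fitted to a single group estimates that group's mean
    fit_a <- lm(measurement ~ 1, data = subset(sample_data, product == "A"))
    print(all.equal(as.numeric(coef(fit_a)), as.numeric(product_means[["A"]])))
    ```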
    

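
  • If you would rather not add another package, a similar plot can probably be built with ggplot2 alone by summarising the data first. This is a different technique from the one above, not part of the original answer: it precomputes one row per product and draws it with `geom_segment()` and `geom_text()`. The names `index`, `line_data`, `mean_value`, `x_min` and `x_max` are made up for this sketch.

    ```r
    library(ggplot2)

    # Data from the question, plus an explicit index for the x axis
    product <- c("A","A","B","B","A","A","B","B","C","C","D","D","C","C","D","D")
    measurement <- c(120, 122, 42, 44, 119, 118, 45, 43,
                     280, 281, 502, 501, 279, 278, 503, 504)
    sample_data <- data.frame(product, measurement)
    sample_data$index <- seq_len(nrow(sample_data))

    # One row per product: its mean and the x range its points span
    line_data <- do.call(rbind, lapply(split(sample_data, sample_data$product), function(d) {
      data.frame(product = d$product[1],
                 mean_value = mean(d$measurement),
                 x_min = min(d$index),
                 x_max = max(d$index))
    }))

    p <- ggplot(sample_data, aes(x = index, y = measurement, colour = product)) +
      geom_point() +
      # Dashed horizontal segment at each product's mean
      geom_segment(data = line_data,
                   aes(x = x_min, xend = x_max, y = mean_value, yend = mean_value),
                   linetype = 2) +
      # Mean value printed just above the middle of each segment
      geom_text(data = line_data,
                aes(x = (x_min + x_max) / 2, y = mean_value, label = mean_value),
                colour = "black", vjust = -0.5) +
      labs(x = "Data Points")
    p
    ```

    Keeping the summary in its own data frame makes it easy to inspect the means or round the labels before plotting.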