Print segments for factor levels into stripchart in base R


I have a dataframe with a numerical variable and a factor variable, like this:

set.seed(123)
df <- data.frame(
  numbers = c(rnorm(50, 3), runif(50)),
  levels = factor(sample(LETTERS[1:5], 100, replace = TRUE))
)

What I'd like to do is draw a stripchart that plots df$numbers against df$levels and adds a vertical line segment marking the mean of each level.

stripchart(df$numbers ~ df$levels, method = "jitter")

Obviously, I could draw the mean line for each level separately, e.g.:

segments(x0 = mean(df$numbers[df$levels=="A"]), y0 = 1-0.3, y1 = 1+0.3, col = "red" )

And so on for all the other levels, which is tedious when there are many of them. So I've tried this for loop:

for(i in seq(unique(df$levels))){
  segments(x0 = mean(df$numbers[df$levels==i]),
           y0 = i - 0.3,
           y1 = i + 0.3,
           col = "red", lty = 3, lwd = 2)
}

But that doesn't draw anything (and doesn't throw an error either). What's the cleanest and simplest code to add the mean segments?


Solution

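  • The loop in the question compares df$levels with the integer index i, which never matches a level name, so mean() receives an empty vector and returns NaN, and segments() silently draws nothing. Because the 'levels' column is a factor, use levels() to store its levels in 'un1', loop over the indices of 'un1', and for each index take the mean of 'numbers' where the 'levels' column equals un1[i]. The index i then gives the vertical position of the segment: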

    un1 <- levels(df$levels)
    for (i in seq_along(un1)) {
      segments(x0 = mean(df$numbers[df$levels == un1[i]]),
               y0 = i - 0.3,
               y1 = i + 0.3,
               col = "red", lty = 3, lwd = 2)
    }
    

    Checking the means:

    with(df, tapply(numbers, levels, FUN = mean))
    #      A        B        C        D        E 
    #1.390202 1.541655 2.086605 2.377122 1.663159
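
Since segments() accepts vectors for its coordinates, the loop can probably be dropped altogether. The following is a minimal sketch of that idea, not part of the original answer; the names `means` and `ypos` are made up for illustration, and the data frame is rebuilt with an explicit factor() call so the block runs on its own.

```r
set.seed(123)
df <- data.frame(
  numbers = c(rnorm(50, 3), runif(50)),
  levels = factor(sample(LETTERS[1:5], 100, replace = TRUE))
)

stripchart(numbers ~ levels, data = df, method = "jitter")

# One mean per factor level, returned in the order of levels(df$levels)
means <- tapply(df$numbers, df$levels, mean)

# stripchart() places the groups at positions 1, 2, ..., one per level
ypos <- seq_along(means)

# segments() is vectorized, so a single call draws every mean line
segments(x0 = means, y0 = ypos - 0.3,
         x1 = means, y1 = ypos + 0.3,
         col = "red", lty = 3, lwd = 2)
```

This relies on tapply() and stripchart() both ordering the groups by factor level, which is why the positions in `ypos` line up with the strips.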