I have a dataframe with a numerical variable and a factor variable, like this:
set.seed(123)
df <- data.frame(
numbers = c(rnorm(50, 3), runif(50)),
levels = sample(LETTERS[1:5], 100, replace = T)
)
What I'd like to do is a stripchart that plots df$numbers against df$levels and inserts vertical segment lines representing the mean for each level.
stripchart(df$numbers ~ df$levels, method = "jitter")
Obviously, I could insert the means line for each level separately, e.g.:
segments(x0 = mean(df$numbers[df$levels=="A"]), y0 = 1-0.3, y1 = 1+0.3, col = "red" )
And so on for all the other levels, which is tedious when there are many of them. So I've tried this for loop:
for(i in seq(unique(df$levels))){
segments(x0 = mean(df$numbers[df$levels==i]),
y0 = i - 0.3,
y1 = i + 0.3,
col = "red", lty = 3, lwd = 2)
}
But that doesn't print anything (and doesn't throw an error either). What's the cleanest and simplest code to insert the means segments?
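The loop draws nothing because i runs over the integers 1 to 5, so df$levels == i is never TRUE; the mean of an empty vector is NaN, and segments() omits segments with missing (NA/NaN) coordinates without raising an error. Instead, store the levels of the 'levels' column in 'un1', loop over the sequence along 'un1', and for each level take the mean of 'numbers' over the rows where 'levels' equals that level to position the segment. Wrapping the column in factor() also covers R >= 4.0.0, where data.frame() keeps strings as character by default and levels() on the bare column would return NULL.
un1 <- levels(factor(df$levels))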
for(i in seq_along(un1)){
segments(x0 = mean(df$numbers[df$levels==un1[i]]),
y0 = i - 0.3,
y1 = i + 0.3,
col = "red", lty = 3, lwd = 2)
}
Checking the means:
with(df, tapply(numbers, levels, FUN = mean))
#       A        B        C        D        E
#1.390202 1.541655 2.086605 2.377122 1.663159
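Since segments() is vectorized over its coordinates, the loop can probably be dropped altogether by reusing the tapply() result. The following is only a minimal sketch under that assumption; the names means and pos are introduced here for illustration and are not part of the original code.

```r
# Recreate the example data from the question
set.seed(123)
df <- data.frame(
  numbers = c(rnorm(50, 3), runif(50)),
  levels = sample(LETTERS[1:5], 100, replace = TRUE)
)

# One mean per level, in sorted level order (the order stripchart() uses)
means <- tapply(df$numbers, df$levels, mean)
pos <- seq_along(means)  # positions 1, 2, ... of the groups in the plot

stripchart(numbers ~ levels, data = df, method = "jitter")
# segments() is vectorized: a single call draws one vertical line per level
segments(x0 = as.vector(means), y0 = pos - 0.3, y1 = pos + 0.3,
         col = "red", lty = 3, lwd = 2)
```

This keeps the means and their plot positions in the same order because both tapply() and the formula interface of stripchart() group by the sorted levels of the column.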