Tags: r, ggplot2, boxplot, t-test

How do I show individual points of a boxplot in R?


I have the data frame df1:

              Name        Y_N FIPS  score1 score2
 1:        Alabama         0    1   2633      8
 2:         Alaska         0    2    382      1
 3:        Arizona         1    4   2695     41
 4:       Arkansas         1    5   2039     10
 5:     California         1    6  27813    524
 6:       Colorado         0    8   8609    133
 7:    Connecticut         1    9   5390    111
 8:       Delaware         0   10    858      3
 9:        Florida         1   12  14172    215
10:        Georgia         1   13   9847    308
11:         Hawaii         0   15    720      0
12:          Idaho         1   16    845      7

I would like to perform a t-test to see if score1 differs based on Y_N, and then plot the two groups against each other. So far I have made a boxplot (image not shown).

Instead of a boxplot, I want a plot that shows all of the individual points, with a horizontal line at each group's mean and 95% confidence intervals. It should look like my sketch (image not shown), except with confidence bars. How is this done? I would also like to add the p-value as text in a corner of the graph.

I might try:

text(x = max(df1$Y_N) + 1,
     y = min(df1$score1) + 20000,
     labels = paste0("\np-value = ", round(coef_lm[2, 4], 5)),
     pos = 4)

But I realize that coef_lm[2, 4] comes from a linear model, not from a t-test. How do I access the output of a t-test?


Solution

  • I'm not sure why you added that extra point in your code. But on your original data, you might use ggplot2 and ggpubr.

    Edit: Now more like your Paint drawing.

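    library(ggplot2)
    library(ggpubr)  # stat_compare_means(); mean_cl_normal also needs Hmisc installed

    ggplot(df1, aes(x = as.factor(Y_N), y = score1)) +
      geom_jitter(position = position_jitter(width = 0.1, height = 0)) +  # horizontal jitter only
      # 95% confidence interval of the mean
      stat_summary(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.3) +
      # zero-height error bar = horizontal line at the mean (after_stat() replaces the deprecated ..y..)
      stat_summary(fun = "mean", geom = "errorbar", aes(ymax = after_stat(y), ymin = after_stat(y)), col = "red", width = 0.5) +
      stat_compare_means(method = "t.test") +  # prints the t-test p-value on the plot
      xlab("Group") + ylab("Score 1")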
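
    If you want the numbers behind the plot rather than only the picture, here is a minimal sketch in base R. `t.test()` returns a list of class "htest" whose elements can be pulled out by name, which answers the "how do I access the outputs" part. The interval that `mean_cl_normal` draws should match a t-based interval around each group mean. I am assuming the default Welch test is what you want; `tt`, `p_label` and `ci_by_group` are just names I made up, and df1 is rebuilt with only the two columns needed so the snippet runs on its own.

    ```r
    # df1 from this post, rebuilt with only the two columns needed here.
    df1 <- data.frame(
      Y_N    = c(0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L),
      score1 = c(2633L, 382L, 2695L, 2039L, 27813L, 8609L,
                 5390L, 858L, 14172L, 9847L, 720L, 845L)
    )

    # Welch two-sample t-test (t.test's default, var.equal = FALSE).
    tt <- t.test(score1 ~ Y_N, data = df1)
    tt$statistic  # t value
    tt$parameter  # degrees of freedom
    tt$p.value    # p-value
    tt$estimate   # the two group means

    # Text label that could be placed in a corner of the plot.
    p_label <- paste0("p-value = ", signif(tt$p.value, 3))

    # Mean and t-based 95% CI per group: mean +/- qt(0.975, n - 1) * sd / sqrt(n).
    ci_by_group <- do.call(rbind, lapply(split(df1$score1, df1$Y_N), function(x) {
      n    <- length(x)
      half <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
      data.frame(n = n, mean = mean(x), lower = mean(x) - half, upper = mean(x) + half)
    }))
    ci_by_group
    ```

    `p_label` can then be handed to `annotate("text", ...)` in ggplot2, or to `text()` in base graphics, if you would rather place the label yourself than rely on `stat_compare_means()`.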
    


    Original Data

    df1 <- structure(list(Name = structure(1:12, .Label = c("Alabama", "Alaska", 
    "Arizona", "Arkansas", "California", "Colorado", "Connecticut", 
    "Delaware", "Florida", "Georgia", "Hawaii", "Idaho"), class = "factor"), 
        Y_N = c(0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L), 
        FIPS = c(1L, 2L, 4L, 5L, 6L, 8L, 9L, 10L, 12L, 13L, 15L, 
        16L), score1 = c(2633L, 382L, 2695L, 2039L, 27813L, 8609L, 
        5390L, 858L, 14172L, 9847L, 720L, 845L), score2 = c(8L, 1L, 
        41L, 10L, 524L, 133L, 111L, 3L, 215L, 308L, 0L, 7L)), class = "data.frame", row.names = c("1:", 
    "2:", "3:", "4:", "5:", "6:", "7:", "8:", "9:", "10:", "11:", 
    "12:"))