One-sample t-test in R within groups


I have been running one-sample t-tests in R, but today I hit a problem. My data are grouped by one variable, and I want to run a one-sample t-test for each group. I can do this easily in SPSS, but I can't work out how to do it in R. Can anyone help?

Sample scenario:

Location=rep(c("Area_A","Area_B"),4) 
temp=rnorm(length(Location),34,5) 
sample_data=data.frame(Location,temp)
sample_data
Location       temp
1   Area_A 32.73782
2   Area_B 26.29996
3   Area_A 40.75101
4   Area_B 26.68309
5   Area_A 33.94259
6   Area_B 26.48326
7   Area_A 37.92506
8   Area_B 29.22532

Say the hypothesised mean in the above example is 35. The one-sample t-test would be:

t.test(sample_data$temp,mu=35)

which gives me

 One Sample t-test

data:  sample_data$temp
t = -1.6578, df = 7, p-value = 0.1413
alternative hypothesis: true mean is not equal to 35
95 percent confidence interval:
 27.12898 36.38304
sample estimates:
mean of x 
 31.75601

But this tests all the groups combined. I can do the per-group version in SPSS. Is there a way to do this in R, ideally with a single line of code? Thanks in advance.


Solution

  • One solution is to save t.test results per group as a list:

    # reproducible results
    set.seed(8)
    
    # example data
    Location=rep(c("Area_A","Area_B"),4) 
    temp=rnorm(length(Location),34,5) 
    sample_data=data.frame(Location,temp)
    
    library(dplyr)
    
    dt_res = sample_data %>%
      group_by(Location) %>%                       # for each group
      summarise(res = list(t.test(temp, mu=35)))   # run t.test and save results as a list
    
    # see the list of results
    dt_res$res  
    
    # [[1]]
    # 
    # One Sample t-test
    # 
    # data:  temp
    # t = -0.76098, df = 3, p-value = 0.502
    # alternative hypothesis: true mean is not equal to 35
    # 95 percent confidence interval:
    #   29.93251 38.11170
    # sample estimates:
    #   mean of x 
    # 34.0221 
    # 
    # 
    # [[2]]
    # 
    # One Sample t-test
    # 
    # data:  temp
    # t = -1.045, df = 3, p-value = 0.3728
    # alternative hypothesis: true mean is not equal to 35
    # 95 percent confidence interval:
    #   26.37007 39.36331
    # sample estimates:
    #   mean of x 
    # 32.86669 
    

    Another solution is to save t.test results per group as a dataframe:

    library(dplyr)
    library(tidyr)
    library(broom)
    
    sample_data %>%
      group_by(Location) %>%                       
      summarise(res = list(tidy(t.test(temp, mu=35)))) %>%
      unnest(res)                                  # name the list column explicitly; recent tidyr warns without it
    
    # # A tibble: 2 x 9
    #   Location estimate statistic p.value parameter conf.low conf.high method            alternative
    #    <fct>       <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl> <chr>             <chr>      
    # 1 Area_A       34.0    -0.761   0.502         3     29.9      38.1 One Sample t-test two.sided  
    # 2 Area_B       32.9    -1.05    0.373         3     26.4      39.4 One Sample t-test two.sided 
    

    The idea behind both approaches is the same: group by Location and run t.test on each group. Which one to use depends only on the output format you prefer.
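
The same group-then-test idea can also be sketched in base R, without any packages. This is a minimal sketch that isn't part of the approaches above: it assumes the same `sample_data` layout, and the names `tests` and `summary_df` are made up for illustration. `split()` produces one vector of `temp` per `Location`, `lapply()` runs `t.test` on each, and the last step collects a few fields from each result into one row per group.

```r
# Base R sketch: no extra packages needed
set.seed(8)

# example data (same construction as above)
Location <- rep(c("Area_A", "Area_B"), 4)
temp <- rnorm(length(Location), 34, 5)
sample_data <- data.frame(Location, temp)

# split temp into one vector per Location, then test each against mu = 35
tests <- lapply(split(sample_data$temp, sample_data$Location),
                t.test, mu = 35)

# full printout for a single group
tests$Area_A

# collect selected fields into a data frame, one row per group
# (summary_df is a hypothetical name used only in this sketch)
summary_df <- do.call(rbind, lapply(names(tests), function(g) {
  tt <- tests[[g]]
  data.frame(Location  = g,
             estimate  = unname(tt$estimate),
             statistic = unname(tt$statistic),
             df        = unname(tt$parameter),
             p.value   = tt$p.value,
             conf.low  = tt$conf.int[1],
             conf.high = tt$conf.int[2])
}))
summary_df
```

The named list `tests` corresponds to the list output of the first approach, and `summary_df` to the data frame output of the second, so the choice again comes down to which shape is more convenient downstream.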