Loop a t-test through a list of data frames


I have a load of survey data that I need to run a t-test through. It looks something like this (but not much like this, a dolphin is unlikely to be 52mm):

Area                    Season  Species Length (mm)
Christchurch            Spring  dolphin 52
Christchurch            Spring  dolphin 54
Christchurch            Spring  dolphin 46
Christchurch            Spring  dolphin 40
Christchurch            Spring  dolphin 38
Christchurch            Autumn  dolphin 52
Christchurch            Autumn  dolphin 54
Christchurch            Autumn  dolphin 46
Christchurch            Autumn  dolphin 40
Christchurch            Autumn  dolphin 38
Christchurch            Spring  ray     52
Christchurch            Spring  ray     54
Christchurch            Spring  ray     46
Christchurch            Spring  ray     40
Christchurch            Spring  ray     38
Christchurch            Autumn  ray     52
Christchurch            Autumn  ray     54
Christchurch            Autumn  ray     46
Christchurch            Autumn  ray     40
Christchurch            Autumn  ray     38

My problem is that I have a range of species and about 2000 measurements, and I need to run a paired t-test for each species between seasons. I am very new to R and to coding in general, so any help in making this process more efficient is appreciated. I am fully aware I have probably not gone about this in the most streamlined way.

I'd like to loop the t-test over the species somehow, get a nice, understandable output, and be able to apply the script to other locations easily (I have 6).

I have split the large data frame by species and removed the empty data frames from the list:

list_df <- split(ld22, ld22$SPECIES_NAME)
list_df <- list_df[sapply(list_df, nrow) > 0]

I then tried this, which I found by googling the problem:

p <- list()
for (i in seq_along(list_df)) {
  p[[i]] <- pairwise.t.test(list_df[[i]]$TOTAL_LENGTH_MM, list_df[[i]]$SURVEY_TYPE, p.adjust.method = "none")
}
p

There are no error messages, but I don't get any results and I have no idea where to go next. Any help would be much appreciated.


Solution

  • Everything in one go using purrr:

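    library(purrr)
    library(dplyr)
    # Column names follow the example table; use the real ones (e.g. SPECIES_NAME, TOTAL_LENGTH_MM) for the full data
    ld22 |>
      split(ld22$Species) |>      # named list: one data frame per species
      keep(~ nrow(.x) > 0) |>     # drop species with no rows
      imap(~ pairwise.t.test(x = .x$Length, g = .x$Season, p.adjust.method = "none") |>
             broom::tidy() |>
             mutate(species = .y))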
    

    Output:

    $dolphin
    # A tibble: 1 x 4
      group1 group2 p.value species
      <chr>  <chr>    <dbl> <chr>  
    1 Spring Autumn       1 dolphin
    
    $ray
    # A tibble: 1 x 4
      group1 group2 p.value species
      <chr>  <chr>    <dbl> <chr>  
    1 Spring Autumn       1 ray
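
If a single table is easier to read than a list of tibbles, the per-species results can be stacked into one data frame. Below is a minimal base-R sketch of that idea, not a definitive implementation. It assumes the column names Species, Season and Length used above. The helper name `season_tests` is hypothetical, and the `ld22` built here is only a made-up copy of the example table from the question.

```r
# Made-up sample mirroring the question's example table (assumed column names).
lengths_mm <- c(52, 54, 46, 40, 38)
ld22 <- data.frame(
  Area    = "Christchurch",
  Season  = rep(rep(c("Spring", "Autumn"), each = 5), times = 2),
  Species = rep(c("dolphin", "ray"), each = 10),
  Length  = rep(lengths_mm, times = 4)
)

# Hypothetical helper: returns one row per species and season pair.
season_tests <- function(df) {
  by_species <- split(df, df$Species, drop = TRUE)  # drop = TRUE skips empty groups
  rows <- lapply(names(by_species), function(sp) {
    d <- by_species[[sp]]
    # A comparison needs at least two seasons; skip the species otherwise.
    if (length(unique(d$Season)) < 2) return(NULL)
    res <- pairwise.t.test(d$Length, d$Season, p.adjust.method = "none")
    # Turn the matrix of p-values into long format: one row per pair of seasons.
    pv <- as.data.frame(as.table(res$p.value), stringsAsFactors = FALSE)
    names(pv) <- c("group1", "group2", "p.value")
    pv <- pv[!is.na(pv$p.value), ]                  # keep only pairs that were tested
    data.frame(species = rep(sp, nrow(pv)), pv)
  })
  out <- do.call(rbind, rows)                       # NULL entries are ignored by rbind
  rownames(out) <- NULL
  out
}

result <- season_tests(ld22)
print(result)

# The same helper can be applied per location if all areas share one data frame.
by_area <- lapply(split(ld22, ld22$Area), season_tests)
```

With the sample data the Spring and Autumn lengths are identical, so every p-value is 1, as in the output above. Note that `pairwise.t.test` runs unpaired comparisons unless `paired = TRUE` is passed, and a paired test only makes sense if each Spring measurement can be matched to a specific Autumn one.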