Tags: r, dataframe, dplyr, anova, rstatix

How do I perform calculations using the columns of a data table from an ANOVA using the dplyr and rstatix packages?


I have a data frame and have run an ANOVA on it with anova_test() from rstatix. I then want to use one of the resulting columns in a calculation and store the result in a new column with mutate(). However, I get an error saying that this operation cannot be performed on an object of this class:

Error: `x` must be a vector, not a <anova_test/data.frame/rstatix_test> object.

How can I perform a calculation such as F + 1 on the F column of the ANOVA result?

library(dplyr)
library(rstatix)

Temperature <- factor(c(rep("cold", times = 4),
                        rep("hot", times = 4)),
                      levels = c("cold", "hot"))

Light <- factor(rep(c(rep("blue", times = 2),
                      rep("yellow", times = 2)),
                    times = 2),
                levels = c("blue", "yellow"))

Result <- c(90.40, 85.20, 21.70, 25.30,
            75.12, 77.36, 6.11, 10.8)

Data <- data.frame(Temperature, Light, Result)

NewColumn <- Data %>%
  anova_test(formula = Result ~ Temperature*Light) %>%
  mutate(New = `F` + 1) # <-- fails with the error shown above

Solution

  • As mentioned by JKupzig in the comments, this is a known issue in dplyr as documented here: https://github.com/tidyverse/dplyr/issues/5286.

    anova_test() returns a data frame whose class vector is anova_test, data.frame, rstatix_test, in that order, and mutate() from dplyr seems to fail when data.frame is not the last element of that vector. You can check the classes of the ANOVA output as follows:

    Data %>% anova_test(formula = Result ~ Temperature*Light) %>% class()
    
    [1] "anova_test"   "data.frame"   "rstatix_test"
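
The same check can be done on a stored object with base R alone. The following is a minimal sketch, not output from rstatix: `res` is a hand-built stand-in (a hypothetical name, not in the original) that is only given the same class vector as the one printed above.

```r
# Hypothetical stand-in for the anova_test() result: a plain data frame
# that is given the class vector reported above.
res <- data.frame(Effect = c("Temperature", "Light"),
                  F = c(42.25, 1041.366))
class(res) <- c("anova_test", "data.frame", "rstatix_test")

# The object still counts as a data frame ...
inherits(res, "data.frame")                   # TRUE

# ... but data.frame sits in position 2, not at the end of the class vector.
inherits(res, "data.frame", which = TRUE)     # 2
identical(tail(class(res), 1), "data.frame")  # FALSE
```

With a real result, replace the stand-in with the object returned by anova_test().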
    

    As a workaround, you can add as_tibble() to your dplyr pipe after anova_test(). This resets the classes to tbl_df, tbl, and data.frame, in that order.

    Data %>% anova_test(formula = Result ~ Temperature*Light) %>% as_tibble() %>% class()
    
    [1] "tbl_df"     "tbl"        "data.frame"

    With the classes reset, mutate() works as expected:

    Data %>%
        anova_test(formula = Result ~ Temperature*Light) %>% 
        as_tibble() %>%
        mutate(New= `F` + 1)
    
    # A tibble: 3 x 8
      Effect              DFn   DFd        F         p `p<.05`   ges     New
      <chr>             <dbl> <dbl>    <dbl>     <dbl> <chr>   <dbl>   <dbl>
    1 Temperature           1     4   42.2   0.003     "*"     0.914   43.2 
    2 Light                 1     4 1041.    0.0000055 "*"     0.996 1042.  
    3 Temperature:Light     1     4    0.725 0.442     ""      0.153    1.72
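
If only one extra column is needed, a plain base R assignment may be enough, since it does not go through mutate() at all. The sketch below again uses a hand-built stand-in called `res` (a hypothetical name) rather than a real anova_test() result. It assumes that rstatix defines no `$<-` method of its own for these classes, so the assignment falls through to the data.frame method.

```r
# Hypothetical stand-in with the same class vector as the anova_test() output.
res <- data.frame(
  Effect = c("Temperature", "Light", "Temperature:Light"),
  F = c(42.25, 1041.366, 0.725)
)
class(res) <- c("anova_test", "data.frame", "rstatix_test")

# Base R column assignment: dispatches to `$<-.data.frame`,
# which restores the original class vector after adding the column.
res$New <- res$F + 1

res$New     # F + 1 for each effect
class(res)  # unchanged: "anova_test" "data.frame" "rstatix_test"
```

Unlike the as_tibble() route, this keeps the original classes, but it cannot be chained inside a dplyr pipe.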
    

    Note that this removes the classes anova_test and rstatix_test. If those classes matter further down the line, an alternative workaround is set_class() from the magrittr package, which can reorder the class vector so that data.frame comes last (magrittr is a dependency of dplyr, so there is no need to install it separately).

    Data %>%
        anova_test(formula = Result ~ Temperature*Light) %>%
        magrittr::set_class(c("anova_test", "rstatix_test", "data.frame")) %>% 
        class()
    
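    [1] "anova_test"   "rstatix_test" "data.frame"

    With data.frame moved to the end of the class vector, mutate() works and the result still prints as an ANOVA table:

    Data %>%
        anova_test(formula = Result ~ Temperature*Light) %>%
        magrittr::set_class(c("anova_test", "rstatix_test", "data.frame")) %>%
        mutate(New = `F` + 1)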
    
    ANOVA Table (type II tests)
    
                 Effect DFn DFd        F        p p<.05   ges      New
    1       Temperature   1   4   42.250 3.00e-03     * 0.914   43.250
    2             Light   1   4 1041.366 5.50e-06     * 0.996 1042.366
    3 Temperature:Light   1   4    0.725 4.42e-01       0.153    1.725
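
The set_class() call above hard-codes the class names. A small helper can do the same reordering for any data-frame-like object. This is only a sketch: `move_df_class_last()` is a hypothetical name, not a function from dplyr, magrittr, or rstatix, and it is shown here on a hand-built stand-in rather than a real anova_test() result.

```r
# Hypothetical helper: move "data.frame" to the end of the class vector,
# keeping all other classes in their original order.
move_df_class_last <- function(x) {
  stopifnot(inherits(x, "data.frame"))
  class(x) <- c(setdiff(class(x), "data.frame"), "data.frame")
  x
}

# Stand-in object with the class order produced by anova_test().
res <- data.frame(Effect = c("Temperature", "Light"),
                  F = c(42.25, 1041.366))
class(res) <- c("anova_test", "data.frame", "rstatix_test")

fixed <- move_df_class_last(res)
class(fixed)  # "anova_test" "rstatix_test" "data.frame"
```

In the pipe above, the helper would take the place of the magrittr::set_class() step, between anova_test() and mutate().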