Is there a way to handle missing values with ttestBF() in R?


ttestBF() from the BayesFactor package throws the following error when trying to analyse data with missing values:

> ttestBF(formula = Level2 ~ Completers, r = sqrt(2)/2, data = tib)
Error in checkFormula(formula, data, analysis = "indept") : 
  Dependent variable must not contain missing or infinite values.

Although is.na is mentioned in the documentation, adding it as an argument to ttestBF() changes nothing (though I may simply be applying it incorrectly):

> ttestBF(formula = Level2 ~ Completers, r = sqrt(2)/2, is.na = TRUE, data = tib)
Error in checkFormula(formula, data, analysis = "indept") : 
  Dependent variable must not contain missing or infinite values.
In addition: Warning message:
data coerced from tibble to data frame 

My questions:

  • Is there a way to handle missing values with ttestBF() from the BayesFactor package?
  • If yes, how?

Reproducible data:

nn <- 100 # total sample size (both Completers and Non Completers)

# create tibble with 2 level samples (completers & non completers), 50 each
tib <- tibble::tibble(Completers = rep(c(1, 0), each = nn/2),
                      Level = c(sample(1:9, 50, replace = TRUE,
                                             prob = c(57/1043*100, 275/1043*100,
                                                      398/1043*100, 199/1043*100,
                                                      72/1043*100, 27/1043*100,
                                                      12/1043*100, 2/1043*100,
                                                      1/1043*100)),
                                      sample(1:9, 50, replace = TRUE,
                                             prob = c(57/1043*100, 275/1043*100,
                                                      398/1043*100, 199/1043*100,
                                                      72/1043*100, 27/1043*100,
                                                      12/1043*100, 2/1043*100,
                                                      1/1043*100))))

# Value labels
expss::val_lab(tib$Completers) = expss::num_lab("
                                 1 Completers
                                 0 Non Completers")


expss::val_lab(tib$Level) = expss::num_lab("1 A
                                            2 B
                                            3 C
                                            4 D
                                            5 E
                                            6 F
                                            7 G
                                            8 H
                                            9 I")

# Copy 2nd column into 3rd and call it Level2
tib$Level2 <- tib$Level

# Insert a single missing value (NA) into Level2 (the 3rd column)
tib[3, 3] = NA

Output with no missing values:

> BayesFactor::ttestBF(formula = Level ~ Completers, r = sqrt(2)/2, data = tib)
Bayes factor analysis
--------------
[1] Alt., r=0.707 : 0.3507102 ±0.02%

Against denominator:
  Null, mu1-mu2 = 0 
---
Bayes factor type: BFindepSample, JZS

Warning message:
data coerced from tibble to data frame 

Output with missing values:

> BayesFactor::ttestBF(formula = Level2 ~ Completers, r = sqrt(2)/2, data = tib)
Error in checkFormula(formula, data, analysis = "indept") : 
  Dependent variable must not contain missing or infinite values.
In addition: Warning message:
data coerced from tibble to data frame

Output with missing values and argument is.na = TRUE:

> BayesFactor::ttestBF(formula = Level2 ~ Completers, r = sqrt(2)/2, is.na = TRUE, data = tib)
Error in checkFormula(formula, data, analysis = "indept") : 
  Dependent variable must not contain missing or infinite values.
In addition: Warning message:
data coerced from tibble to data frame

sessioninfo::session_info() extract:

 setting  value
 version  R version 4.2.1 (2022-06-23)
 os       macOS Monterey 12.5.1
 rstudio  2022.07.1+554 Spotted Wakerobin (desktop)

 package      * version    date (UTC) lib source
 BayesFactor  * 0.9.12-4.4 2022-07-05 [1] CRAN (R 4.2.0)
 expss        * 0.11.1     2022-01-07 [1] CRAN (R 4.2.0)
 tibble       * 3.1.8      2022-07-22 [1] CRAN (R 4.2.0)

Solution

  • You can remove the rows containing NA before running the test:

    BayesFactor::ttestBF(formula = Level2 ~ Completers, r = sqrt(2)/2, data = dplyr::filter(tib, !is.na(Level2) & !is.na(Completers)))
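
The same row removal can be sketched in base R, without dplyr, using complete.cases(). This is only an illustration under assumptions: dat is a hypothetical plain data frame standing in for tib, the grouping variable is stored as a factor rather than a labelled number, and the ttestBF() call is skipped if BayesFactor is not installed.

```r
# Hypothetical stand-in for the question's data (not the original tib):
# two groups of 50 and one missing value in the dependent variable.
set.seed(1)
dat <- data.frame(
  Completers = factor(rep(c("Completers", "Non Completers"), each = 50)),
  Level2     = sample(1:9, 100, replace = TRUE)
)
dat$Level2[3] <- NA

# Keep only the rows where both model variables are observed.
vars      <- c("Level2", "Completers")
dat_clean <- dat[complete.cases(dat[, vars]), ]

# Report how many rows were dropped, so the deletion is not silent.
n_dropped <- nrow(dat) - nrow(dat_clean)
message("Rows dropped because of missing values: ", n_dropped)

# Run the Bayesian t test on the complete cases (only if BayesFactor is available).
if (requireNamespace("BayesFactor", quietly = TRUE)) {
  bf <- BayesFactor::ttestBF(formula = Level2 ~ Completers,
                             r = sqrt(2) / 2, data = dat_clean)
  print(bf)
}
```

Restricting complete.cases() to the two model variables avoids discarding rows that are only missing values in unrelated columns. Note that this is a complete-case analysis: the Bayes factor is computed on the remaining rows only.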