
Return a string vector instead of integer vector when combining factors


I have the following test data:

test_data <- as.data.frame(list(
  Drugs = c(1, 2, 2, 2, 1, 2, 2, 3, 2, 2, 2, 2, 2, 2, 1, 3, 2, 1, 1, 2, 3, 3, 2, 3, 1, 2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 3, 1, 3, 1, 1, 2, 1, 2, 2, 1, 3, 1, 1, 3, 3, 2, 1, 3, 2, 2, 3, 1, 3, 2, 1, 2, 3, 1, 2, 3, 2, 1, 3, 1, 2, 3, 3, 2, 3, 2, 2, 2, 3, 1, 2, 2, 3),
  Result = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1, 2, 1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 2, 2, 1, 2, 1, 2, 2, 1, 1, 1, 1, 2, 2, 1)
))

and I need to write a function that returns a string vector naming the cell with the largest standardized residual. The site that checks my solution says I must return a string vector but that I am returning an integer vector. When I run the function in my own R installation, it returns a string vector, so I have no idea what is going on. Here is the function:

maxre <- function(x){
  x$Drugs <- factor(x$Drugs, labels = c('drug_1', 'drug_2','drug_3'))
  x$Result <- factor(x$Result, labels = c('negative', 'positive'))
  zz <- table(x)
  res <- chisq.test(zz)
  tab <- res$stdres
  tab <- as.data.frame(tab)
  tab$Drugs <- factor(tab$Drugs, labels = c('drug_1', 'drug_2','drug_3'))
  tab$Result <- factor(tab$Result, labels = c('negative', 'positive'))
  tab2 <- as.data.frame(tab)
  maximal <- max(tab2$Freq)
  interm <- tab[tab$Freq == maximal,]
  result <- c(interm[1, 1], interm[1, 2])
  result <- as.vector(result)
  return(result)
}

The answer is

[1] "drug_3"   "negative"

Solution

  • The problem results from the line:

    result <- c(interm[1, 1], interm[1, 2])
    

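    where the first and second columns of interm are both factors, so the outcome depends on how the installed R version combines factors with c(). The site presumably runs an older R version than you do.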

    In the changelog of R 4.1.0:

    Using c() to combine a factor with other factors now gives a factor, an ordered factor when combining ordered factors with identical levels.

    • Version >= R 4.1.0
    c(factor('a'), factor('b'))
    # [1] a b
    # Levels: a b
    
    • Version < R 4.1.0: the factor attributes are dropped and only the underlying integer codes are combined
    c(factor('a'), factor('b'))
    # [1] 1 1
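
The two behaviours above also explain why the later as.vector() call gives different types. The following is a minimal sketch that emulates both cases explicitly, so it does not depend on the installed R version. The names drug, outcome, new_style and old_style are hypothetical stand-ins, not part of the original code.

```r
# Hypothetical stand-ins for interm[1, 1] and interm[1, 2].
drug    <- factor("drug_3",   levels = c("drug_1", "drug_2", "drug_3"))
outcome <- factor("negative", levels = c("negative", "positive"))

# Emulates R >= 4.1.0: the combined value is still a factor,
# and as.vector() on a factor returns its labels as character.
combined_factor <- factor(c(as.character(drug), as.character(outcome)))
new_style <- as.vector(combined_factor)

# Emulates R < 4.1.0: only the integer codes are combined,
# and as.vector() leaves them as integers.
old_style <- as.vector(c(as.integer(drug), as.integer(outcome)))

is.character(new_style)  # TRUE
is.integer(old_style)    # TRUE
```

Here old_style holds the level positions (3 for drug_3, 1 for negative) rather than the labels, which matches the integer vector the site reports.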
    

    To get a character vector in every R version, convert the factor columns to character before combining them with c():

    interm[1:2] <- lapply(interm[1:2], as.character)
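
Putting the conversion into the function, one possible version-independent rewrite could look like the sketch below. It is an illustration, not the only fix: the name maxre_chr is hypothetical, and it assumes, as in the question, that x contains only the columns Drugs and Result with three and two distinct values respectively. It uses the stringsAsFactors argument of as.data.frame() for tables so that no factor ever reaches c().

```r
# Hypothetical rewrite of maxre() that returns character on any R version.
maxre_chr <- function(x) {
  x$Drugs  <- factor(x$Drugs,  labels = c("drug_1", "drug_2", "drug_3"))
  x$Result <- factor(x$Result, labels = c("negative", "positive"))

  # Standardized residuals of the chi-squared test, still a table.
  stdres <- chisq.test(table(x))$stdres

  # Long format; keep the Drugs/Result labels as character, not factor.
  tab <- as.data.frame(stdres, stringsAsFactors = FALSE)

  # First row with the largest standardized residual.
  top <- tab[which.max(tab$Freq), ]

  # Both columns are character, so c() returns a character vector.
  c(top$Drugs, top$Result)
}
```

Calling maxre_chr(test_data) should then return a character vector of length two whether or not c() preserves factors.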