Tags: r, lapply, sapply, tapply

Use a function on multiple columns (variables) in R


I am trying to run tests of homogeneity of variance using the leveneTest function from the car package. I can run the test on a single variable like so (using the iris dataset as an example):

library(car)
library(datasets)

data(iris)

leveneTest(iris$Sepal.Length, iris$Species)

However, I would like to run the test on all the dependent variables in the dataset simultaneously (so Sepal.Length, Sepal.Width, Petal.Length, Petal.Width). I am guessing it has something to do with the apply family of functions (sapply, lapply, tapply) but I just can't figure out how. The closest I came is something like this:

lapply(iris, leveneTest(group = iris$Species))

However, I get the error:

Error in leveneTest.default(group = iris$Species) : 
  argument "y" is missing, with no default

I understand this is probably because the call does not specify the outcome variable. I am certain I must be missing some obvious use of the apply functions, but I just don't understand what it is. Apologies for the basic question, but I am relatively new to R and often apply the same function to multiple variables (usually by copying the code several times), so it would be great to understand how to use these functions properly :)


Solution

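  • In your attempt, leveneTest(group = iris$Species) is a complete call that R evaluates before lapply ever runs, which is why it complains that y is missing. Pass the function itself instead, and supply any arguments that are the same for every call through lapply's ... argument. Like this: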

    lapply(subset(iris, select = -Species), leveneTest, group = iris$Species)
    

    help("lapply") explains that ... is for "optional arguments to FUN" (optional from lapply's point of view, not necessarily optional for FUN) and gives lapply(x, quantile, probs = 1:3/4) as an example.
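
To make the forwarding through ... more concrete, here is a minimal sketch, assuming the car package is installed. It runs the call above, repeats it with an anonymous function (which should give the same results), and then collects one p-value per variable. The names outcomes, res_dots, res_anon and p_values are illustrative only and are not part of the original question or answer.

```r
library(car)  # provides leveneTest()

# Keep only the outcome columns; the grouping factor is supplied separately.
outcomes <- subset(iris, select = -Species)

# Form 1: arguments placed after FUN are forwarded unchanged to every call,
# so each column becomes leveneTest's first argument (y).
res_dots <- lapply(outcomes, leveneTest, group = iris$Species)

# Form 2: an anonymous function that spells out the same call explicitly.
res_anon <- lapply(outcomes, function(y) leveneTest(y, group = iris$Species))

# Each result is an anova-style data frame; the first entry of its
# "Pr(>F)" column is the p-value for the group effect.
p_values <- sapply(res_dots, function(tab) tab[["Pr(>F)"]][1])
print(p_values)
```

The anonymous-function form is more verbose, but it is handy when the column is not the function's first argument or when each result needs further processing before it is returned.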