How to get the mean of specific columns in dataframe and store in vector (in R)


I want to get the mean of specific columns in a dataframe and store those means in a vector in R.

The names of those columns are stored in a character vector whose contents depend on user input. For those columns I want to calculate the means and store them in a vector that I can then loop over in another part of my code.

I tried as follows, e.g.:

specific_variables <- c("variable1", "variable2")  # can be of a different length depending on user input
data <- data.frame(...)  # a data frame with multiple columns, including "variable1" and "variable2"
mean_xm <- 0  # empty variable for storage purposes

# for loop over the variables
for (i in length(specific_variables)) {
  mean_xm[i] <- mean(data$specific_variables[i], na.rm = TRUE)
}

print(mean_xm)

This fails with: Error: object of type 'closure' is not subsettable

Second attempt using sapply:

colMeans(data[sapply(data, is.numeric)])

This gives me the means of all numeric columns of the dataframe, whereas I only want the means of the columns named in specific_variables. Ideally, I'd like to store those means in a vector, as in my first attempt.


Solution

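  • Subset the dataframe by the vector of column names and pass the result to colMeans; unname drops the column names so that a plain numeric vector is returned: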

    v1 <- unname(colMeans(data[specific_variables], na.rm = TRUE))
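For comparison, here is a minimal sketch that runs the one-liner on made-up data and also repairs the original loop. The sample values and the extra column variable3 are assumptions for illustration only. Two things go wrong in the loop from the question: for (i in length(x)) visits only the last index, and data$specific_variables[i] looks for a column literally named "specific_variables" instead of using the name stored in the vector. The 'closure' error most likely means that data still referred to R's built-in data() function at that point, which is a guess because the dataframe definition is elided in the question.

```r
# Hypothetical sample data; only the column names come from the question.
data <- data.frame(
  variable1 = c(1, 2, NA, 4),
  variable2 = c(10, 20, 30, 40),
  variable3 = c("a", "b", "c", "d")
)
specific_variables <- c("variable1", "variable2")

# Vectorised version: select the columns by name, then take the column means.
v1 <- unname(colMeans(data[specific_variables], na.rm = TRUE))

# Repaired loop: preallocate the result, iterate over every index with
# seq_along(), and extract each column by its name with [[ ]].
mean_xm <- numeric(length(specific_variables))
for (i in seq_along(specific_variables)) {
  mean_xm[i] <- mean(data[[specific_variables[i]]], na.rm = TRUE)
}

print(v1)
print(mean_xm)
```

Both approaches should give the same vector. The colMeans version is shorter, but it requires every selected column to be numeric.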