I have a data frame generated through a series of functions that insert numerical arrays. The df consists of 156 variables with 4261 observations each. I'm trying to find the mean per column, but the colMeans() function gives the following error:
> colMeans(results)
Error in if (inherits(X[[j]], "data.frame") && ncol(xj) > 1L) X[[j]] <- as.matrix(X[[j]]) :
missing value where TRUE/FALSE needed
I think it has something to do with the structure of the dataframe, hence I tried to alter it, but that gave another error.
> str(results)
'data.frame': 4261 obs. of 156 variables:
$ r5.2.5 : num 0 0 0 0 0 0 0 0 0 0 ...
$ r10.2.5 :'data.frame': 4261 obs. of 1 variable:
..$ ret: num 0 0 0 0 0 0 0 0 0 0 ...
$ r20.2.5 :'data.frame': 4261 obs. of 1 variable:
..$ ret: num 0 0 0 0 0 0 0 0 0 0 ...
$ r30.2.5 :'data.frame': 4261 obs. of 1 variable:
..$ ret: num 0 0 0 0 0 0 0 0 0 0 ...
....
> results <- as.data.frame(as.numeric(results))
Error in as.data.frame(as.numeric(results)) :
(list) object cannot be coerced to type 'double'
> results <- data.matrix(results)
Error in data.matrix(results) :
(list) object cannot be coerced to type 'double'
I think one of the functions I use creates data frames and attaches them to the existing df, hence the 'data.frame' in the structure of the arrays.
Is there a way to restructure the data frame so that functions like colMeans() and colSums() can run on it?
It seems that some of your columns are data frames themselves. You need to turn them back into vectors; here is how to do it:
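## flag the columns that are themselves data frames
my_dfs <- vapply(results, is.data.frame, logical(1))
## flatten each of those columns into a plain vector
## (list-style indexing and lapply() keep this working even when only one column is flagged)
results[my_dfs] <- lapply(results[my_dfs], unlist, use.names = FALSE)
## then colMeans(results) works, or equivalently
my_means <- sapply(results, mean)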
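To check the approach on something small, here is a minimal sketch on toy data. The column names `a` and `b` and their values are made up for illustration, and it assumes, as in your str() output, that each nested data frame holds a single column:

```r
# Toy data (hypothetical): 'b' is deliberately stored as a one-column data frame,
# mimicking the nested structure shown by str(results) in the question
results <- data.frame(a = c(1, 2, 3))
results$b <- data.frame(ret = c(4, 5, 6))

# Flag the nested columns and flatten each one into a plain numeric vector
is_df <- vapply(results, is.data.frame, logical(1))
results[is_df] <- lapply(results[is_df], unlist, use.names = FALSE)

# Every column is now an ordinary vector, so the column-wise helpers work
str(results)
col_means <- colMeans(results)  # a = 2, b = 5
col_sums  <- colSums(results)   # a = 6, b = 15
```

If a nested data frame had more than one column, unlist() would return a vector longer than the number of rows and the assignment would fail, so that case would need to be handled separately.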