I have a list of 52 datasets, and I am trying to get the column sums for a range of columns in each dataset and export them to a new data frame. I know I want to sum everything from column 9 onward, but the total number of columns varies between datasets. ("locs" is my list of data frames.)
Here is what I have tried using a for loop:
summaryofsums <- vector("list",1) #empty vector
for (df in 1:length(locs)){
newdf <- df[, colSums(df!= 0) > 0] #get rid of all columns that have only 0s
newdfsum <- colSums(newdf[,9:length(newdf)])
summaryofsums[i] <- newdfsum
}
I receive the following error:
Error in colSums(df != 0) :
'x' must be an array of at least two dimensions
version _
platform x86_64-apple-darwin15.6.0
arch x86_64
os darwin15.6.0
system x86_64, darwin15.6.0
status
major 3
minor 5.3
year 2019
month 03
day 11
svn rev 76217
language R
version.string R version 3.5.3 (2019-03-11)
nickname Great Truth
Thank you!!
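The loop fails because for (df in 1:length(locs)) iterates over the indices 1, 2, ..., so df is a single number rather than a data frame, and colSums() needs an object with two dimensions. Iterate over the data frames themselves instead, for example using sapply: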
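result <- sapply(locs, function(df) {
  # drop columns that contain only zeros; drop = FALSE keeps a data frame
  newdf <- df[, colSums(df != 0, na.rm = TRUE) > 0, drop = FALSE]
  # sum every remaining column from the 9th onward
  colSums(newdf[, 9:ncol(newdf), drop = FALSE], na.rm = TRUE)
})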
result
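If you would rather keep the for loop, here is a minimal sketch of how it could be repaired, followed by one possible way to collect the sums in a single data frame. The toy locs list, the helper make_df and the column names (id1..id8, x1, x2, ...) are made up for illustration and stand in for your real data. The sketch also assumes that at least 9 columns remain after the all-zero columns are dropped; otherwise 9:ncol(newdf) would count backwards.

```r
# Toy stand-in for the real list: hypothetical data frames with
# 8 leading columns and a varying number of value columns.
make_df <- function(n_extra) {
  lead <- as.data.frame(matrix(1, nrow = 3, ncol = 8))
  extra <- as.data.frame(matrix(seq_len(3 * n_extra), nrow = 3))
  names(lead) <- paste0("id", 1:8)
  names(extra) <- paste0("x", seq_len(n_extra))
  cbind(lead, extra)
}
locs <- list(a = make_df(2), b = make_df(3))
locs$b$x2 <- 0  # an all-zero column that should be dropped

summaryofsums <- vector("list", length(locs))  # one slot per data frame
names(summaryofsums) <- names(locs)

for (i in seq_along(locs)) {
  df <- locs[[i]]  # the data frame itself, not its index
  newdf <- df[, colSums(df != 0, na.rm = TRUE) > 0, drop = FALSE]
  summaryofsums[[i]] <- colSums(newdf[, 9:ncol(newdf), drop = FALSE],
                                na.rm = TRUE)
}

# Long-format summary: one row per (dataset, column) pair, which works
# even though each dataset contributes a different number of columns.
summary_df <- do.call(rbind, lapply(names(summaryofsums), function(nm) {
  data.frame(dataset = nm,
             column = names(summaryofsums[[nm]]),
             sum = unname(summaryofsums[[nm]]),
             stringsAsFactors = FALSE)
}))
summary_df
```

Two things changed compared with the loop in the question: the data frame is pulled out with locs[[i]], and the result is stored with summaryofsums[[i]] (double brackets, and i rather than an undefined variable), because each element is a vector of several sums. The long format is only one option; since the number of summed columns differs between datasets, the results cannot simply be bound together as equal-length rows.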