I want to calculate the mean of each numeric variable in the following example. The means need to be grouped by each level of "id" and, separately, by each level of "status".
set.seed(10)
dfex <- data.frame(
  id     = c("2", "1", "1", "1", "3", "2", "3"),
  status = c("hit", "miss", "miss", "hit", "miss", "miss", "miss"),
  var3 = rnorm(7), var4 = rnorm(7), var5 = rnorm(7), var6 = rnorm(7)
)
For the means of the "id" groups, the first row of output would be labeled "mean-id-1", followed by rows labeled "mean-id-2" and "mean-id-3". For the means of the "status" groups, the rows would be labeled "mean-status-miss" and "mean-status-hit". My objective is to generate these means and their row labels programmatically.
I've tried many different permutations of apply functions, but each has issues. I've also experimented with the aggregate function.
With base R, the following works for the "id" column:
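# Average every "var" column within each level of id
means_id <- aggregate(dfex[, grep("var", names(dfex))], list(dfex$id), mean)
# Build the row labels from the grouping column, then drop that column
rownames(means_id) <- paste0("mean-id-", means_id$Group.1)
means_id$Group.1 <- NULL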
Output:
                var3       var4       var5       var6
mean-id-1 -0.7182503 -0.2604572 -0.3535823 -1.3530417
mean-id-2  0.2042702 -0.3009548  0.6121843 -1.4364211
mean-id-3 -0.4567655  0.8716131  0.1646053 -0.6229102
The same approach works for the "status" column:
# Same steps, grouping by status instead of id
means_status <- aggregate(dfex[, grep("var", names(dfex))], list(dfex$status), mean)
rownames(means_status) <- paste0("mean-status-", means_status$Group.1)
means_status$Group.1 <- NULL
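Since the two snippets differ only in the grouping column, they could be folded into one function and the results stacked. The following is a minimal sketch rather than a definitive solution: group_means() is a hypothetical helper name (not a base R function), and it assumes the numeric columns are the ones whose names start with "var".

```r
# Example data copied from the question
set.seed(10)
dfex <- data.frame(
  id     = c("2", "1", "1", "1", "3", "2", "3"),
  status = c("hit", "miss", "miss", "hit", "miss", "miss", "miss"),
  var3 = rnorm(7), var4 = rnorm(7), var5 = rnorm(7), var6 = rnorm(7)
)

# Hypothetical helper: means of value_cols within each level of group_col,
# with row names of the form "mean-<group_col>-<level>"
group_means <- function(df, group_col, value_cols) {
  out <- aggregate(df[, value_cols, drop = FALSE],
                   by = list(group = df[[group_col]]),
                   FUN = mean)
  rownames(out) <- paste0("mean-", group_col, "-", out$group)
  out$group <- NULL  # the grouping column is now encoded in the row names
  out
}

# Assumption: every column to average has a name starting with "var"
value_cols <- grep("^var", names(dfex), value = TRUE)

# Apply the helper to each grouping column and stack the results
all_means <- do.call(
  rbind,
  lapply(c("id", "status"), function(g) group_means(dfex, g, value_cols))
)
all_means
```

Note that aggregate() sorts the groups, so the "status" rows come out as "mean-status-hit" followed by "mean-status-miss". Adding another grouping column only requires extending the vector passed to lapply().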