I have a dataset in long format with 3 factors (strain, genotype, region) and 1 value (volume). The dataset is called individualData. I'm trying to calculate the mean and standard deviation of volume for every combination of strain * genotype * region, except for combinations that have no data (since genotype labels depend on the strain). The following command seems to do this, since it produces the expected number of rows:
summaryData = aggregate( .~strain:genotype:region, individualData, FUN = function(x) c(mn=mean(x), stdev=sd(x)))
The problem is that head(summaryData) shows 5 columns (volume is replaced with volume.mn and volume.stdev), as I would have expected, but names(summaryData) or colnames(summaryData) returns only 4 columns, namely my original columns. How do I refer to the columns properly? I just want to collapse this into a data.frame that I understand how to work with. Does anyone with more experience with the aggregate function know how to do this?
Thanks!
First, here's some reproducible sample data, which I'm assuming matches your structure:
set.seed(15)
individualData <- data.frame(
  volume = runif(120),
  expand.grid(region=1:2, genotype=1:3, strain=1:2)
)
Then you're running
summaryData = aggregate( .~strain:genotype:region, individualData,
    FUN = function(x) c(mn=mean(x), stdev=sd(x)))
and if you look at the structure of what's returned, you get
str(summaryData)
# 'data.frame': 12 obs. of 4 variables:
# $ strain : int 1 2 1 2 1 2 1 2 1 2 ...
# $ genotype: int 1 1 2 2 3 3 1 1 2 2 ...
# $ region : int 1 1 1 1 1 1 2 2 2 2 ...
# $ volume : num [1:12, 1:2] 0.526 0.409 0.407 0.445 0.566 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : NULL
# .. ..$ : chr "mn" "stdev"
so aggregate has actually stuffed a matrix into the volume column. You can index these values with
summaryData$volume[,"mn"]
summaryData$volume[,"stdev"]
or turn it into a proper data.frame with
summaryData <- do.call(data.frame, summaryData)
summaryData$volume.mn
summaryData$volume.stdev
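To see why names() and head() disagree, it may help to compare the column names before and after flattening. The following is a minimal sketch that rebuilds the sample data above. Note that I've written the grouping as volume ~ strain + genotype + region, which is my own substitution for the formula in the question (I'm assuming it produces the same groups), and flatData is just an illustrative name.

```r
# Rebuild the sample data from above.
set.seed(15)
individualData <- data.frame(
  volume = runif(120),
  expand.grid(region = 1:2, genotype = 1:3, strain = 1:2)
)

# Assumption: "+" formula used here instead of the ":" form in the question.
summaryData <- aggregate(volume ~ strain + genotype + region, individualData,
                         FUN = function(x) c(mn = mean(x), stdev = sd(x)))

# Before flattening: four columns, and the last one is a two-column matrix.
names(summaryData)
is.matrix(summaryData$volume)
colnames(summaryData$volume)

# After flattening: each matrix column becomes its own data.frame column.
flatData <- do.call(data.frame, summaryData)
names(flatData)
```

The print method used by head() expands the matrix column for display, which is why you see volume.mn and volume.stdev there even though names() reports a single volume column.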