I have a matrix derived from a table with three original columns: column 1 = site codes, column 2 = species codes and column 3 = biomass weight for each species. The biomass weight of each species in each plot is displayed in the matrix. The matrix can be calculated with one of the three following options (thanks to feedback on an earlier question):
reshape::cast(dissimBiom, plot ~ species, value = 'biomass', fun = mean)
by(dissimBiom, dissimBiom$biomass, function(x) with(x, table(plot, species)))
tapply(dissimBiom$biomass,list(dissimBiom$plot,dissimBiom$species),mean)
Note: dissim is the .csv file name for the two-column table; dissimBiom is the .csv file name for the three-column table.
I now would like to generate a dissimilarity matrix based on the above matrix. The below code requires packages vegan and ecodist.
I had earlier used the function
matrix <- with(dissim, table(plot,species))
to generate a matrix based on two columns (site vs species) only and then used
matrix.meta <- metaMDS(matrix, k=2, distance = "bray", trymax=10)
to generate a dissimilarity matrix. This worked just fine.
In contrast, attempts to generate a dissimilarity matrix where the matrix has been generated with one of the following calls (as above)
reshape::cast(dissimBiom, plot ~ species, value = 'biomass', fun = mean)
by(dissimBiom, dissimBiom$biomass, function(x) with(x, table(plot, species)))
tapply(dissimBiom$biomass,list(dissimBiom$plot,dissimBiom$species),mean)
using the same function
matrixBiom.meta <- metaMDS(matrixBiom, k=2, distance = "bray", trymax=10)
results in the following error message
Error in if (any(autotransform, noshare > 0, wascores) && any(comm < 0)) { :
missing value where TRUE/FALSE needed
Note: I call matrixBiom from the file matrixBiom.csv which I wrote to convert the NA's to 0, using
write.csv(matrixBiom, "matrixBiom.csv", na="0",row.names=TRUE)
In contrast to matrixBiom.meta, matrix.meta was directly used on 'matrix' without writing a .csv file.
Also, the matrix generated by
matrix <- with(dissim, table(plot,species))
looks like this,
species
plot xanfla1 xangria xanret
a100f177r 1.4 0 8.9
a100f562r 0 5.6 0
a100f56r 22.4 0 1.3
while the matrix generated by either of the other approaches has the format
zinunk ziz150 zizang
a100f177r 22.4 NA 2.6
a100f562r 1.3 NA NA
a100f56r NA 3.1 NA
a100f5r NA NA 0.2
My questions would be,
1) In any of these functions
reshape::cast(dissimBiom, plot ~ species, value = 'biomass', fun = mean)
by(dissimBiom, dissimBiom$biomass, function(x) with(x, table(plot, species)))
tapply(dissimBiom$biomass,list(dissimBiom$plot,dissimBiom$species),mean)
can NAs be converted directly to 0s, to avoid writing and reading in a .csv file? Maybe this would solve the problem.
2) What fixes could be used for the three-column table example to conduct an NMDS using metaMDS?
3) Are there alternative functions to calculate a dissimilarity matrix for the three-column table example?
Any advice would be very much appreciated.
Please find a reproducible data subset below:
> dput(dataframe)
structure(list(plot = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L), .Label = c("a1f17r",
"a1f56r", "a1m17r", "a1m5r"), class = "factor"), species = structure(c(12L,
29L, 16L, 21L, 24L, 19L, 6L, 13L, 14L, 5L, 16L, 12L, 26L, 9L,
29L, 28L, 17L, 15L, 25L, 6L, 3L, 8L, 27L, 6L, 1L, 7L, 18L, 10L,
12L, 11L, 2L, 20L, 13L, 27L, 22L, 23L, 4L, 1L), .Label = c("annunk",
"blurip", "cae089", "caepar", "chrodo", "clihir", "dalpin", "derele",
"embphi", "ficmeg", "indunk", "jactom", "leeind", "merbor", "mergra",
"mikcor", "nep127", "nepbis", "nepbis1", "palunk", "rubcle",
"sinirp", "spagyr1", "sphoos", "stitrut", "tetped", "tinpet",
"uncgla", "zinunk"), class = "factor"), biomass = c(100.6, 284.6,
13.8, 2.8, 1, 3.1, 8.8, 0.5, 15.2, 13.8, 6.1, 5.3, 18.8, 4.1,
199, 68, 143.3, 11.3, 6.5, 0.2, 54.1, 39, 22, 1.2, 6.3, 6, 0.1,
2.8, 42, 1.9, 0.1, 0.2, 0.2, 0.1, 2.1, 4.3, 0.7, 0.2)), .Names = c("plot",
"species", "biomass"), class = "data.frame", row.names = c(NA,
-38L))
Not easily, so do it in a secondary step. I find the tapply() result neater, so I'll go with that (assuming your example data is in dat):
dat2 <- as.data.frame(with(dat, tapply(biomass, list(plot, species), mean)))
giving
> dat2[, 1:6]
annunk blurip cae089 caepar chrodo clihir
a1f17r NA NA NA NA NA 0.2
a1f56r NA NA NA NA 13.8 8.8
a1m17r 0.2 NA NA 0.7 NA NA
a1m5r 6.3 0.1 54.1 NA NA 1.2
Then to convert NA to 0 we do
dat2[is.na(dat2)] <- 0
which gives us
> dat2[, 1:6]
annunk blurip cae089 caepar chrodo clihir
a1f17r 0.0 0.0 0.0 0.0 0.0 0.2
a1f56r 0.0 0.0 0.0 0.0 13.8 8.8
a1m17r 0.2 0.0 0.0 0.7 0.0 0.0
a1m5r 6.3 0.1 54.1 0.0 0.0 1.2
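If your R is version 3.4.0 or later, tapply() also accepts a default argument that fills the empty plot/species cells, so the two steps can be collapsed into one. A minimal sketch, assuming a small subset of the rows from your dput() (the object names here are only for illustration):

```r
# A few rows taken from the example data; tinpet occurs twice in plot a1m5r
dat <- data.frame(
  plot    = c("a1f56r", "a1f56r", "a1f17r", "a1m5r", "a1m5r", "a1m5r", "a1m17r"),
  species = c("chrodo", "clihir", "clihir", "annunk", "tinpet", "tinpet", "annunk"),
  biomass = c(13.8, 8.8, 0.2, 6.3, 22, 0.1, 0.2)
)

# default = 0 fills plot/species combinations that have no rows at all
# (the default argument requires R >= 3.4.0)
dat2 <- as.data.frame(
  with(dat, tapply(biomass, list(plot, species), mean, default = 0))
)

dat2
```

Note that default only affects combinations with no observations; if a recorded biomass value is itself NA, mean() still returns NA for that cell, and the dat2[is.na(dat2)] <- 0 step would still be needed.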
Given the solution to Q1, no further steps are required: with the NAs replaced by 0, dat2 can be passed to metaMDS() as before.
Follow the solution in Question 1 above and then run dist() or vegdist() or some other function that can compute dissimilarity matrices from data frame objects.
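As a rough illustration, here is a sketch using the six columns of dat2 shown above. dist() is in base R; the Bray-Curtis part is written out by hand (the helper name bray is my own, not a package function) so that it runs without vegan, and the vegdist() call is only attempted if vegan is installed.

```r
# The first six columns of dat2 after NA -> 0, as printed above
dat2 <- data.frame(
  annunk = c(0, 0, 0.2, 6.3),
  blurip = c(0, 0, 0, 0.1),
  cae089 = c(0, 0, 0, 54.1),
  caepar = c(0, 0, 0.7, 0),
  chrodo = c(0, 13.8, 0, 0),
  clihir = c(0.2, 8.8, 0, 1.2),
  row.names = c("a1f17r", "a1f56r", "a1m17r", "a1m5r")
)

# Euclidean distances between plots (base R)
d_euc <- dist(dat2)

# Bray-Curtis by hand: sum(|x - y|) / sum(x + y) for each pair of plots
bray <- function(x, y) sum(abs(x - y)) / sum(x + y)
m <- as.matrix(dat2)
n <- nrow(m)
d_bray <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i != j) d_bray[i, j] <- bray(m[i, ], m[j, ])
  }
}
d_bray <- as.dist(d_bray)

# The same dissimilarities via vegan, if the package is available
if (requireNamespace("vegan", quietly = TRUE)) {
  d_veg <- vegan::vegdist(dat2, method = "bray")
}

d_bray
```

Plots that share no species (for example a1f17r and a1m17r in this subset) get a Bray-Curtis dissimilarity of 1.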