Tags: r, matrix, multi-dimensional-scaling

How to display plant species biomass in a site by species matrix?


I previously asked "How to display two columns as binary (presence/absence) matrix?", which received two excellent answers. I would now like to take this a step further and add a third column to the original plot and species columns, giving the biomass of each species in each plot.

Column 1 (plot) holds the codes for ~200 plots, column 2 (species) holds the codes for ~1200 species, and column 3 (biomass) gives the dry weight. Each plot has more than one species, and each species can occur in more than one plot. The total number of rows is ~2700.

> head(dissim)
    plot species biomass
1 a1f56r  jactom 20.2
2 a1f56r  zinunk 10.3
3 a1f56r  mikcor 0.4
4 a1f56r  rubcle 1.3
5 a1f56r  sphoos 12.4
6 a1f56r nepbis1 8.2

> tail(dissim)
           plot species biomass
2707 og100m562r  selcup 4.7
2708 og100m562r  pip139 30.5
2709 og100m562r  stasum 0.1
2710 og100m562r  artani 3.4
2711 og100m562r  annunk 20.7
2712 og100m562r  rubunk 22.6

I would like to create a plot by species matrix that displays the biomass of each species in each plot (rather than a binary presence/absence matrix), something of the form:

       jactom rubcle chrodo uncgla
a1f56r    1.3      0   10.3      0
a1f17r      0   22.3      0      4
a1m5r     3.2      0    3.7    9.7
a1m5r       1      0      0   20.1
a1m17r    5.4    6.9      0      1

Any advice on how to add this additional level of complexity would be very much appreciated.


Solution

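  • Both xtabs and tapply return a two-dimensional result that can be used as a matrix: xtabs gives an object of class "table", and tapply gives a plain matrix. The examples below count occurrences; putting a numeric column such as biomass on the left-hand side of the xtabs formula (or passing it as the first argument to tapply) sums that column instead: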

    # Using MrFlick's example
    > xtabs(~a+b,dd)
       b
    a   f g h i j
      a 0 1 0 2 3
      b 0 0 2 1 0
      c 0 3 0 0 1
      d 2 2 2 1 1
      e 1 1 2 4 1
    
    # --- the tapply solution is a bit less elegant
    > dd$one <- 1
    > with(dd, tapply(one, list(a,b), sum))
       f  g  h  i  j
    a NA  1 NA  2  3
    b NA NA  2  1 NA
    c NA  3 NA NA  1
    d  2  2  2  1  1
    e  1  1  2  4  1
    
    # If you want to make the NA's become zeros then:
    
    > tbl <- with(dd, tapply(one, list(a,b), sum))
    > tbl[is.na(tbl)] <- 0
    > tbl
      f g h i j
    a 0 1 0 2 3
    b 0 0 2 1 0
    c 0 3 0 0 1
    d 2 2 2 1 1
    e 1 1 2 4 1
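
Applied to the question's data, the same idea might look like the sketch below. It is only an illustration: the small `dissim` data frame is a made-up stand-in for the real one, and the names `biomass_tab` and `biomass_mat` are hypothetical. Putting `biomass` on the left-hand side of the formula makes xtabs sum that column for each plot and species combination, so a combination that appears on several rows would be added together.

```r
# Toy stand-in for the asker's `dissim` data frame (values invented for illustration)
dissim <- data.frame(
  plot    = c("a1f56r", "a1f56r", "a1f17r", "a1f17r", "a1m5r"),
  species = c("jactom", "rubcle", "rubcle", "uncgla", "jactom"),
  biomass = c(20.2, 1.3, 22.3, 4.0, 3.2)
)

# Sum biomass for every plot x species combination; absent combinations become 0
biomass_tab <- xtabs(biomass ~ plot + species, data = dissim)
biomass_tab

# Same result via tapply; absent combinations come back as NA, so set them to 0
biomass_mat <- with(dissim, tapply(biomass, list(plot, species), sum))
biomass_mat[is.na(biomass_mat)] <- 0
biomass_mat
```

If a data frame is needed rather than a table, `as.data.frame.matrix(biomass_tab)` should keep the plot by species layout.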