Creating a summary table based on range categories in R


My data is in the following format:

input<-data.frame(
    region=c("A","T","R","R","T"),
    geomorph=c("F","F","S","S","P"),
    depth=c(2.6,3.5,5.8,6.7,8.9))

> input
  region geomorph depth
1      A        F   2.6
2      T        F   3.5
3      R        S   5.8
4      R        S   6.7
5      T        P   8.9

I would like to create a summary table that, for each depth category (i.e. 0-3, 3-6, 6-10), counts the number of entries per region (i.e. A, R, T) and per geomorphology class (i.e. F, S, P), presented as follows:

output<-data.frame(
    depth.category=c("0-3","3-6","6-10"),
    total=c(1,2,2),
    A=c(1,0,0),
    R=c(0,1,1),
    T=c(0,1,1),
    F=c(1,1,0),
    S=c(0,1,1),
    P=c(0,0,1))

> output
  depth.category total A R T F S P
1            0-3     1 1 0 0 1 0 0
2            3-6     2 0 1 1 1 1 0
3           6-10     2 0 1 1 0 1 1

Any suggestions on how to go about this?


Solution

  • First, create the intervals with cut(), then count the entries with table() and combine the results with cbind():

    intervals <- cut(input$depth, breaks=c(0, 3, 6, 10))
    
    cbind(table(intervals),
          table(intervals, input$region),
          table(intervals, input$geomorph))
    #          A R T F P S
    # (0,3]  1 1 0 0 1 0 0
    # (3,6]  2 0 1 1 1 0 1
    # (6,10] 2 0 1 1 0 1 1
    

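    The result above is a matrix; its first, unnamed column holds the total per interval, and the geomorph columns appear in alphabetical order (F, P, S). Use the following if you want a data.frame: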

    temp <- cbind(table(intervals),
                  table(intervals, input$region),
                  table(intervals, input$geomorph))
    
    temp <- data.frame(depth.category = rownames(temp),
                       as.data.frame(temp, row.names = 1:nrow(temp)))
    names(temp)[2] <- "Total"
    temp
    #   depth.category Total A R T F P S
    # 1          (0,3]     1 1 0 0 1 0 0
    # 2          (3,6]     2 0 1 1 1 0 1
    # 3         (6,10]     2 0 1 1 0 1 1
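
If the table should match the layout requested in the question exactly (labels such as "0-3" instead of "(0,3]", and the geomorph columns in the order F, S, P), a minimal sketch along the same lines could look like this. It relies only on the labels argument of cut() and on explicit factor levels; the names depth.category, counts and result are illustrative choices, not part of the original answer.

```r
# Same example data as in the question.
input <- data.frame(
  region   = c("A", "T", "R", "R", "T"),
  geomorph = c("F", "F", "S", "S", "P"),
  depth    = c(2.6, 3.5, 5.8, 6.7, 8.9)
)

# labels= replaces the default "(0,3]" style names. The intervals are still
# right-closed, so a depth of exactly 3 would be counted under "0-3".
depth.category <- cut(input$depth, breaks = c(0, 3, 6, 10),
                      labels = c("0-3", "3-6", "6-10"))

# Explicit factor levels fix the column order (F, S, P rather than the
# alphabetical F, P, S that table() would otherwise produce).
region   <- factor(input$region,   levels = c("A", "R", "T"))
geomorph <- factor(input$geomorph, levels = c("F", "S", "P"))

# Totals first, then the two cross-tabulations side by side.
counts <- cbind(total = as.vector(table(depth.category)),
                table(depth.category, region),
                table(depth.category, geomorph))

# Move the interval labels from the row names into a regular column.
result <- data.frame(depth.category = rownames(counts), counts)
rownames(result) <- NULL
result
#   depth.category total A R T F S P
# 1            0-3     1 1 0 0 1 0 0
# 2            3-6     2 0 1 1 1 1 0
# 3           6-10     2 0 1 1 0 1 1
```

Setting the factor levels explicitly also keeps a column in the table when a region or geomorph class has no entries in the data, which may or may not be desired.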