Search code examples
rgroup-bysummarize

Group, summarize and transpose


I have a dataframe that looks like this:

ctgroup (dataframe)

Camera Trap Name  Animal Name         a_sum 
 1  CAM27             Chicken             1
 2  CAM27             Dog                 1
 3  CAM27             Dog                 4
 4  CAM28             Cat                 3
 5  CAM28             Dog                 22
 6  CAM28             Dog                 1

*a_sum = No. of animals recorded in a camera

So essentially I want to - Group by 2 fields(Camera Trap Name, Scientific Name) and then Count the number of record in the column "a_sum", and transpose the data so that Animal. Name becomes column and Camera Trap Name my rows. I want to display all the animal names in columns, with 0 if no data available i.e.,

Camera trap name        Dog   Cat   Wolf   Chicken
   CAM28                 23     4     1      4
   CAM27                 5      0     0      4

I tried using the following code

dcast (ctgroup, Camera.Trap.name + Animal.name, value.var  = "a_sum")

And I got the following error:

In dcast(ctgroup, Camera.Trap.name + Scientific.name, value.var = "a_sum") :
  The dcast generic in data.table has been passed a grouped_df and will attempt to redirect to the reshape2::dcast; please note that reshape2 is deprecated, and this redirection is now deprecated as well. Please do this redirection yourself like reshape2::dcast(ctgroup). In the next version, this warning will become an error.

I don't think I know enough to construct the correct code for carrying out this work.


Solution

  • The dplyr approach:

    library(dplyr)
    library(tidyr)
    
    ctgroup %>%
      group_by(Camera, Animal) %>%
      summarize(a_sum = sum(a_sum)) %>%
      pivot_wider(id_cols = Camera, names_from = Animal, values_from = a_sum, values_fill = list(a_sum = 0))