Search code examples
rdplyrtidyversedata-wrangling

R: Transform data, like Excel Pivot table


I am struggling to transform my data to the below output. The output values are the sum of count.

Suspect this will involve a combination of group_by(),pivot_wider(), and summarise(), but haven't had luck with that.

Data:

df1 <- rbind(animal.boolean = c("Y","Y","Y","Y","Y","Y","N"),
             animal.detailed = c("cat", "mouse", "dog", "mouse", "cat", "dog", "cupboard"), 
             count = c(2,1,4,1,8,1,1))

Output:

df2 <- rbind(c(10,2,5,0), c(0,0,0,1))
colnames(df) <- c("cat", "mouse", "dog", "cupboard")
rownames(df) <- c("Y","N")

Solution

  • That matrix is truly bad input format. We can transpose it and turn it to a data frame, and then use pivot_wider's value_fn argument to skip the need for a separate group_by() %>% summarize() step.

    library(tidyr)
    library(readr)
    t(df1) %>%
      as.data.frame() %>%
      type_convert() %>%
      pivot_wider(
        names_from = animal.detailed,
        values_from = count,
        values_fill = 0,
        values_fn = sum
    )
    # # A tibble: 2 × 5
    #   animal.boolean   cat cupboard   dog mouse
    #   <chr>          <dbl>    <dbl> <dbl> <dbl>
    # 1 Y                 10        0     5     2
    # 2 N                  0        1     0     0