Search code examples
rdplyrwide-format-data

Turning data to wide-form


I have data in this format:

ID risk severity
1 A green
1 A amber
1 A amber
1 B amber
1 B amber
1 B amber
2 B green
2 B green
2 B amber
2 C green
2 C red
2 C red

I want to convert it to this format using dplyr on R-Studio:

ID A min_sev max_sev B min_sev max_sev C min_sev max_sev
1 3 green amber 3 amber amber 0 NA NA
2 0 NA NA 3 green amber C green red

where,

min_sev = minimum severity

max_sev = maximum severity

Any help is appreciated, thanks in advance.


Solution

  • Good catch @r2evans, here is an even simpler solution with nest and unnest:

    library(dplyr)
    library(tidyr)
    
    df %>%
      group_by(ID, risk) %>%
      summarise(min_sev = first(severity),
                max_sev = last(severity),
                count = n()) %>%
      group_by(ID, risk) %>%
      nest() %>%
      pivot_wider(names_from = risk, values_from = data) %>%
      unnest(cols = everything(), names_sep = "_")