How to pivot_wider this small dataset? I want to put the race as column headers with each Count but for an unknown reason it puts all the Counts into a list. Also getting the warning below:
library(tidyverse)
a <- structure(list(Count = c(1, 1, 3, 1, 2, 1, 2, 1, 3, 1, 1, 2,
2, 1, 3, 3, 3, 5, 3, 3), race = c("L", "F", "W", "F", "F", "LF",
"F", "F", "F", "F", "F", "F", "F", "S", "F", "F", "F", "F", "F",
"F"), year = c("2012", "2013", "2013", "2013", "2013", "2013",
"2013", "2012", "2013", "2013", "2012", "2013", "2013", "2013",
"2013", "2013", "2013", "2013", "2013", "2013")), row.names = c(NA,
20L), class = "data.frame")
a
Count race year
1 1 L 2012
2 1 F 2013
3 3 W 2013
4 1 F 2013
5 2 F 2013
6 1 LF 2013
7 2 F 2013
8 1 F 2012
9 3 F 2013
10 1 F 2013
11 1 F 2012
12 2 F 2013
13 2 F 2013
14 1 S 2013
15 3 F 2013
16 3 F 2013
17 3 F 2013
18 5 F 2013
19 3 F 2013
20 3 F 2013
a %>% pivot_wider(names_from = race, values_from = Count)
# A tibble: 2 x 6
year L F W LF S
<chr> <list> <list> <list> <list> <list>
1 2012 <dbl [1]> <dbl [2]> <NULL> <NULL> <NULL>
2 2013 <NULL> <dbl [14]> <dbl [1]> <dbl [1]> <dbl [1]>
Warning message:
Values from `Count` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = {summary_fun}` to summarise duplicates.
* Use the following dplyr code to identify duplicates.
{data} %>%
dplyr::group_by(year, race) %>%
dplyr::summarise(n = dplyr::n(), .groups = "drop") %>%
dplyr::filter(n > 1L)
EDITED: Expected output
year L F W LF S
2012 1 1 0 0 0
2012 0 1 0 0 0
2013 0 1 3 1 0
2013 0 1 0 0 0
2013 ... ... ... ... ...
2013 ... ... ... ... ...
2013 ... ... ... ... ...
2013 0 3 0 0 0
2013 ... ... ... ... ...
If I understand, you want something like below. Since you don't want the multiple observations for each race
-year
pair to be combined or aggregated in any way, you can make a new variable (I called it obs
) that is the observation number within race
-year
pair and thus this variable will keep those values separate rather than trying to combine or aggregate them.
library(dplyr)
library(tidyr)
a <- structure(list(Count = c(1, 1, 3, 1, 2, 1, 2, 1, 3, 1, 1, 2,
2, 1, 3, 3, 3, 5, 3, 3), race = c("L", "F", "W", "F", "F", "LF",
"F", "F", "F", "F", "F", "F", "F", "S", "F", "F", "F", "F", "F",
"F"), year = c("2012", "2013", "2013", "2013", "2013", "2013",
"2013", "2012", "2013", "2013", "2012", "2013", "2013", "2013",
"2013", "2013", "2013", "2013", "2013", "2013")), row.names = c(NA,
20L), class = "data.frame")
a %>%
group_by(race, year) %>%
mutate(obs = row_number()) %>%
pivot_wider(names_from = "race", values_from="Count", values_fill=0) %>%
arrange(year, obs) %>%
select(-obs)
#> # A tibble: 16 × 6
#> # Groups: year [2]
#> year L F W LF S
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2012 1 1 0 0 0
#> 2 2012 0 1 0 0 0
#> 3 2013 0 1 3 1 1
#> 4 2013 0 1 0 0 0
#> 5 2013 0 2 0 0 0
#> 6 2013 0 2 0 0 0
#> 7 2013 0 3 0 0 0
#> 8 2013 0 1 0 0 0
#> 9 2013 0 2 0 0 0
#> 10 2013 0 2 0 0 0
#> 11 2013 0 3 0 0 0
#> 12 2013 0 3 0 0 0
#> 13 2013 0 3 0 0 0
#> 14 2013 0 5 0 0 0
#> 15 2013 0 3 0 0 0
#> 16 2013 0 3 0 0 0
Created on 2023-09-03 with reprex v2.0.2