Today, I used the count()
with group_by()
and summarise()
from the dplyr package. Here is the code:
dplyr::group_by(QUARTER, PROVINCE, DATE) %>%
dplyr::summarise(COUNT = count(EDUCATION)) %>%
ungroup()
Here is what my df
looks like after executing the code above:
Quarter PROVINCE DATE COUNT.x COUNT.freq
2020Q1 New-York 2020-01-07 1 30
2021Q1 New-York 2020-01-07 2 21
2020Q1 New-York 2020-01-07 3 55
2021Q1 New-York 2020-01-07 4 8
Note: I did not create COUNT.x
and COUNT.freq
myself! Now, I am trying to take the COUNT.x
as columns and COUNT.freq
as the values. This means pivot_wider()
would be the solution here. However, after trying the following:
pivot_wider(df, names_from = COUNT.x, values_from = COUNT.freq)
It says COUNT.x
and COUNT.freq
do not exist:
Error: Can't subset columns that don't exist.
x Column `COUNT.x` doesn't exist
Also, checking names(df)
gives me:
[1] "QUARTER" "PROVINCE" "DATE" "COUNT"
It does not include COUNT.x
and COUNT.freq
!
Anyone have a solution to this problem?
The issue is that count
is being applied within summarise
resulting in a data.frame/tibble column. It should be outside i.e.
dplyr::count(df, QUARTER, PROVINCE, DATE, EDUCATION)
If we are using the OP's code, it is not a regular data.frame column. We need to reinvoke data.frame
to flatten the columns
out <- df %>% dplyr::group_by(QUARTER, PROVINCE, DATE) %>%
dplyr::summarise(COUNT = count(EDUCATION)) %>%
ungroup()
out1 <- do.call(data.frame, out)