Add rows to complete melted tibble

Consider this "melted" tibble:

structure(list(country = c("Canada", "US", "Canada"), gender = c("male", "male", "female"), count = c(1, 5, 3)), row.names = c(NA, -3L ), class = c("tbl_df", "tbl", "data.frame"))

country	gender	count
Canada	male	1
US	male	5
Canada	female	3

So Canada has a count for males and a count for females, while the US only has a count for males. This implicitly means that US x female count is zero. Since it's 2 countries times 2 genders, the number of rows should be 4, but the row with zero count is omitted. Is there a way to add a row US-female-0 to "complete" this tibble?

A simple way to do it is to reshape to wide and then to long again:

df %>%
    pivot_wider(names_from = gender, values_from = count) %>% 
    mutate(female = ifelse(is.na(female), 0, female)) %>% 
    reshape2::melt("country") %>% 
    tibble

But this is long to write, it may be slow, and is complex if the widened data is bigger than this.

Another possibility, which seems smarter, is:

cross_join(df %>% distinct(country), df %>% distinct(gender)) %>%
    left_join(df) %>%
    mutate(count = ifelse(is.na(count), 0, count))

But this is not easy to insert in the middle of a pipe, for example in case df itself comes from a pipe, without breaking into multiple operations or using global assignment.

Is there a simple way to do this "completion" of the data that is reasonably fast, scalable, and works inside a pipe?

Solution

tidyr::complete(df, country, gender, fill = list(count = 0))

# A tibble: 4 × 3
  country gender count
  <chr>   <chr>  <dbl>
1 Canada  female     3
2 Canada  male       1
3 US      female     0
4 US      male       5