I have a dataset ('DF1') that has count data. It looks like this:
Species | Date | Site | n |
---|---|---|---|
AMCR | 6/1/2021 | SVC | 14 |
AMCR | 6/1/2021 | BMA | 1 |
AMCR | 6/7/2021 | SVA | 2 |
AMCR | 6/15/2021 | SVA | 9 |
AMCR | 6/21/2021 | SVA | 18 |
AMCR | 6/29/2021 | SVA | 18 |
However, my study actually has nine 'Sites' (SVC, BMA, SVA, BMC, TMA, TMC, SRA, SRC, and MCC) and each site has data collected on the same five dates (6/1/2021, 6/8/2021, 6/15/2021, 6/21/2021, and 6/29/2021). DF1 only shows rows for where there were counts in 'n', but if there were no counts, I want the dataframe to be populated with zero counts for each of those dates for each site so that it looks like this:
Species | Date | Site | n |
---|---|---|---|
AMCR | 6/1/2021 | SVC | 14 |
AMCR | 6/7/2021 | SVC | 0 |
AMCR | 6/15/2021 | SVC | 0 |
AMCR | 6/21/2021 | SVC | 0 |
AMCR | 6/29/2021 | SVC | 0 |
AMCR | 6/1/2021 | BMA | 1 |
AMCR | 6/7/2021 | BMA | 0 |
AMCR | 6/15/2021 | BMA | 0 |
AMCR | 6/21/2021 | BMA | 0 |
AMCR | 6/29/2021 | BMA | 0 |
AMCR | 6/1/2021 | SVA | 0 |
AMCR | 6/7/2021 | SVA | 2 |
AMCR | 6/15/2021 | SVA | 9 |
AMCR | 6/21/2021 | SVA | 18 |
AMCR | 6/29/2021 | SVA | 18 |
AMCR | 6/1/2021 | BMC | 0 |
AMCR | 6/7/2021 | BMC | 0 |
AMCR | 6/15/2021 | BMC | 0 |
AMCR | 6/21/2021 | BMC | 0 |
AMCR | 6/29/2021 | BMC | 0 |
AMCR | 6/1/2021 | TMA | 0 |
AMCR | 6/7/2021 | TMA | 0 |
AMCR | 6/15/2021 | TMA | 0 |
AMCR | 6/21/2021 | TMA | 0 |
AMCR | 6/29/2021 | TMA | 0 |
AMCR | 6/1/2021 | TMC | 0 |
AMCR | 6/7/2021 | TMC | 0 |
AMCR | 6/15/2021 | TMC | 0 |
AMCR | 6/21/2021 | TMC | 0 |
AMCR | 6/29/2021 | TMC | 0 |
AMCR | 6/1/2021 | SRA | 0 |
AMCR | 6/7/2021 | SRA | 0 |
AMCR | 6/15/2021 | SRA | 0 |
AMCR | 6/21/2021 | SRA | 0 |
AMCR | 6/29/2021 | SRA | 0 |
AMCR | 6/1/2021 | SRC | 0 |
AMCR | 6/7/2021 | SRC | 0 |
AMCR | 6/15/2021 | SRC | 0 |
AMCR | 6/21/2021 | SRC | 0 |
AMCR | 6/29/2021 | SRC | 0 |
AMCR | 6/1/2021 | MCC | 0 |
AMCR | 6/7/2021 | MCC | 0 |
AMCR | 6/15/2021 | MCC | 0 |
AMCR | 6/21/2021 | MCC | 0 |
AMCR | 6/29/2021 | MCC | 0 |
Is there a way to add rows with 0 counts by checking to see if those date and site combinations don't exist?
Thank you.
library(dplyr)
library(tidyr)
dat %>%
complete(Species, Date, Site, fill = list(n = 0))
# # A tibble: 15 x 4
# Species Date Site n
# <chr> <chr> <chr> <dbl>
# 1 AMCR 6/1/2021 BMA 1
# 2 AMCR 6/1/2021 SVA 0
# 3 AMCR 6/1/2021 SVC 14
# 4 AMCR 6/15/2021 BMA 0
# 5 AMCR 6/15/2021 SVA 9
# 6 AMCR 6/15/2021 SVC 0
# 7 AMCR 6/21/2021 BMA 0
# 8 AMCR 6/21/2021 SVA 18
# 9 AMCR 6/21/2021 SVC 0
# 10 AMCR 6/29/2021 BMA 0
# 11 AMCR 6/29/2021 SVA 18
# 12 AMCR 6/29/2021 SVC 0
# 13 AMCR 6/7/2021 BMA 0
# 14 AMCR 6/7/2021 SVA 2
# 15 AMCR 6/7/2021 SVC 0
dat2 <- merge(dat, do.call(expand.grid, lapply(dat[,1:3], unique)), by = names(dat)[1:3], all = TRUE)
dat2
# Species Date Site n
# 1 AMCR 6/1/2021 BMA 1
# 2 AMCR 6/1/2021 SVA NA
# 3 AMCR 6/1/2021 SVC 14
# 4 AMCR 6/15/2021 BMA NA
# 5 AMCR 6/15/2021 SVA 9
# 6 AMCR 6/15/2021 SVC NA
# 7 AMCR 6/21/2021 BMA NA
# 8 AMCR 6/21/2021 SVA 18
# 9 AMCR 6/21/2021 SVC NA
# 10 AMCR 6/29/2021 BMA NA
# 11 AMCR 6/29/2021 SVA 18
# 12 AMCR 6/29/2021 SVC NA
# 13 AMCR 6/7/2021 BMA NA
# 14 AMCR 6/7/2021 SVA 2
# 15 AMCR 6/7/2021 SVC NA
dat2$n <- ifelse(is.na(dat2$n), 0, dat2$n)
dat2
# Species Date Site n
# 1 AMCR 6/1/2021 BMA 1
# 2 AMCR 6/1/2021 SVA 0
# 3 AMCR 6/1/2021 SVC 14
# 4 AMCR 6/15/2021 BMA 0
# 5 AMCR 6/15/2021 SVA 9
# 6 AMCR 6/15/2021 SVC 0
# 7 AMCR 6/21/2021 BMA 0
# 8 AMCR 6/21/2021 SVA 18
# 9 AMCR 6/21/2021 SVC 0
# 10 AMCR 6/29/2021 BMA 0
# 11 AMCR 6/29/2021 SVA 18
# 12 AMCR 6/29/2021 SVC 0
# 13 AMCR 6/7/2021 BMA 0
# 14 AMCR 6/7/2021 SVA 2
# 15 AMCR 6/7/2021 SVC 0
Data
dat <- structure(list(Species = c("AMCR", "AMCR", "AMCR", "AMCR", "AMCR", "AMCR"), Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"), Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"), n = c(14L, 1L, 2L, 9L, 18L, 18L)), class = "data.frame", row.names = c(NA, -6L))