I have the data frame:
structure(list(Loc_name = c("U3R", "U3R", "U3R", "U3R", "U3R",
"U3R", "HWY301", "HWY301", "HWY301", "HWY301", "HWY301", "HWY301",
"U3R", "U3R", "U3R", "U3R", "U3R", "U3R", "HWY301", "HWY301",
"HWY301", "HWY301", "HWY301", "HWY301"), `fish type` = c("bass",
"bass", "bass", "catfish", "catfish", "catfish", "flathead",
"flathead", "flathead", "bass", "bass", "bass", "bass", "bass",
"bass", "catfish", "catfish", "catfish", "flathead", "flathead",
"flathead", "bass", "bass", "bass"), Report_result_value = c(1.1,
1.2, 1.3, 2.1, 2.2, 2.3, 3.1, 3.2, 3.3, 4.1, 4.2, 4.3, 1.1, 1.2,
1.3, 2.1, 2.2, 2.3, 3.1, 3.2, 3.3, 4.1, 4.2, 4.3), Chemical_Name = c("Cs-137",
"Cs-137", "Cs-137", "Cs-137", "Cs-137", "Cs-137", "Cs-137", "Cs-137",
"Cs-137", "Cs-137", "Cs-137", "Cs-137", "SR-90", "SR-90", "SR-90",
"SR-90", "SR-90", "SR-90", "SR-90", "SR-90", "SR-90", "SR-90",
"SR-90", "SR-90"), detect_flag = c("Y", "Y", "Y", "N", "Y", "N",
"N", "N", "N", "Y", "N", "N", "Y", "Y", "Y", "N", "Y", "N", "N",
"N", "N", "Y", "N", "N")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -24L))
I would like to get the average report result value of a particular chemical, in a particular fish, at a particular location. The rules for taking the average are: if the detect flag is "N" for a particular fish type, at a particular location, for a particular chemical name, then the mean is zero. Otherwise, the mean is the straight average, regardless of detect flag. For example, the average of SR-90 for fish "Flathead" at HWY301 would be 0. Conversely, the average of Cs-137 for fish "bass" at HWY301 would be 4.2.
I've heard Loops are slow in R, so I'm trying to avoid using a loop as much as possible. Does anyone have an approach in R for calculating the mean considering the rules for averaging above? Much appreciated.
I have tried aggregate and dplyr's summarize function but I have not been able to accommodate the averaging rule with those approaches. I'm fairly new to R.
Based on the comments "there has to be at least one detect_flag = 'Y' for the average to be calculated normally, otherwise it's 0":
fish_data %>%
group_by(Loc_name, `fish type`, Chemical_Name) %>%
summarise(meanVal = ifelse("Y" %in% detect_flag,
mean(Report_result_value),
0)) %>%
ungroup()
Result:
# A tibble: 8 × 4
Loc_name `fish type` Chemical_Name meanVal
<chr> <chr> <chr> <dbl>
1 HWY301 bass Cs-137 4.2
2 HWY301 bass SR-90 4.2
3 HWY301 flathead Cs-137 0
4 HWY301 flathead SR-90 0
5 U3R bass Cs-137 1.2
6 U3R bass SR-90 1.2
7 U3R catfish Cs-137 2.2
8 U3R catfish SR-90 2.2