In R, I am looking to conditionally subset my dataset so I can apply the same function to different groups of data. Here are dummy data:
data <- data.frame(id = seq(1, 100, by = 1),
                   sex = sample(c('M', 'F'), 100, replace = TRUE),
                   age_class = sample(c('A', 'S'), 100, replace = TRUE), # A = adult, S = subadult
                   season = sample(c('spring', 'autumn'), 100, replace = TRUE),
                   den_status = sample(c(0,1), 100, replace = TRUE), # 1 = yes, 0 = no. Only females can den and get a 1 or 0, males are all dummy coded as 0
                   weight = sample(80:600, 100, replace = TRUE),
                   offspring = sample(c('Y','N'), 100, replace = TRUE),
                   albumin = rnorm(100, 5, 2),
                   cortisol = rnorm(100, 30, 12),
                   calcium = rnorm(100, 0.3, 0.005),
                   globulin = rnorm(100, 1.9, 0.3),
                   insulin = rnorm(100, 3, 0.13))
As a first pass, I grouped the data by sex, age_class, and den_status and applied a custom function called mod.zscore to each group. The function acts on columns albumin:insulin, and its output is stored in new columns with a _zscore suffix.
library(dplyr)

data.new <- data %>%
  group_by(sex, age_class, den_status) %>%
  # apply mod.zscore within each group; results go to <column>_zscore
  mutate(across(albumin:insulin, mod.zscore,
                .names = "{.col}_zscore")) %>%
  ungroup()
This works fine and does what I need it to do. Where I'm stuck is that I need to conditionally subset or group the data so that I only group by den_status when sex == 'F', age_class == 'A', and season == 'spring'. Currently, my code groups both males and females by den_status, which is not necessarily a problem because all males have den_status = 0 anyway. The problem is that I only want den_status to apply to adult females in spring.
Basically, I want these groupings:
Any help is greatly appreciated. Thank you!
EDIT: I think I'm looking for a solution that will not create new columns because I will need to again work in terms of columns sex, age_class, den_status, and season.
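library(dplyr)

data.new <- data %>%
  # Split only adult spring females by den_status; all other rows keep their season
  mutate(season2 = case_when(
    sex == "F" & age_class == "A" & season == "spring" & den_status == 0 ~ "spring0",
    sex == "F" & age_class == "A" & season == "spring" & den_status == 1 ~ "spring1",
    TRUE ~ season
  )) %>%
  group_by(sex, age_class, season2) %>%
  mutate(n = n())  # group size, useful for checking the resulting groupings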
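If the extra season2 column is a concern (per the EDIT in the question), one option is to use it only as a temporary grouping key and drop it once the function has been applied. The following is a minimal sketch, not a definitive solution. It assumes the groups should be sex, age_class, and season2 as above. Because the question does not show mod.zscore, the version here is a hypothetical stand-in (a median/MAD-based modified z-score), and the dummy data is trimmed to two measurement columns to keep the example short.

library(dplyr)

# Hypothetical stand-in for mod.zscore; the real definition is not shown in the question
mod.zscore <- function(x) {
  0.6745 * (x - median(x, na.rm = TRUE)) / mad(x, constant = 1, na.rm = TRUE)
}

set.seed(1)  # only so the dummy data is reproducible
data <- data.frame(id = 1:100,
                   sex = sample(c("M", "F"), 100, replace = TRUE),
                   age_class = sample(c("A", "S"), 100, replace = TRUE),
                   season = sample(c("spring", "autumn"), 100, replace = TRUE),
                   den_status = sample(c(0, 1), 100, replace = TRUE),
                   albumin = rnorm(100, 5, 2),
                   insulin = rnorm(100, 3, 0.13))

data.new <- data %>%
  # temporary key: den_status is appended only for adult spring females
  mutate(season2 = case_when(
    sex == "F" & age_class == "A" & season == "spring" ~ paste0(season, den_status),
    TRUE ~ season
  )) %>%
  group_by(sex, age_class, season2) %>%
  mutate(across(albumin:insulin, mod.zscore, .names = "{.col}_zscore")) %>%
  ungroup() %>%
  select(-season2)  # drop the helper so only the original columns plus the z-scores remain

The original sex, age_class, season, and den_status columns are left untouched, so later steps can keep working with them. Note that a group containing a single row will produce NaN with this stand-in, since its MAD is zero.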