I am trying to summarise a data frame by grouping on the label column. For each group I want a mean computed under the following conditions:
- if all numbers are NA, then I want to return NA
- if the mean of all the numbers is 1 or lower, I want to return 1
- if the mean of all the numbers is higher than 1, I want the mean of the values in the group that are greater than 1
- all the rest should be 100.
I managed to find the answer and my code now runs well: is.na() should be used instead of ==NA in the first ifelse() statement, and that was the issue.
library(dplyr)  # provides %>%, group_by() and summarize()

label <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7)
sev <- c(NA,NA,NA,NA,1,0,1,1,1,NA,1,2,2,4,5,1,0,1,1,4,5)
Data2 <- data.frame(label,sev)

d <- Data2 %>%
  group_by(label) %>%
  # the first test catches groups where every value is NA (their mean is NaN)
  summarize(sevmean = ifelse(is.na(mean(sev,na.rm=TRUE)),NA,
                      ifelse(mean(sev,na.rm=TRUE)<=1,1,
                      ifelse(mean(sev,na.rm=TRUE)>1,
                             mean(sev[sev>1],na.rm=TRUE),100))))
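To see why is.na() works where ==NA does not, here is a minimal base R sketch of what happens in a group where every value is missing. The vector name all_missing is made up for illustration and is not part of the code above.

```r
# Hypothetical vector standing in for a group where every value is missing
all_missing <- c(NA_real_, NA_real_, NA_real_)

# With na.rm = TRUE nothing is left to average, so the mean is NaN
m <- mean(all_missing, na.rm = TRUE)
print(m)

# == never returns TRUE for a missing value; the comparison itself is NA
print(m == NA)

# is.na() is TRUE for both NA and NaN, so it catches the empty group
print(is.na(m))
```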
Your first condition is the issue here. If we remove the nested ifelse calls and keep only the first one, we get the same output, with NA for every group:
Data2 %>%
  group_by(label) %>%
  summarise(sevmean = ifelse(mean(sev,na.rm=TRUE)==NaN,NA,1))
# label sevmean
# <dbl> <lgl>
#1 1.00 NA
#2 2.00 NA
#3 3.00 NA
#4 4.00 NA
#5 5.00 NA
#6 6.00 NA
#7 7.00 NA
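The all-NA column of type <lgl> follows from how == treats NaN. A small base R illustration, assuming some arbitrary stand-in values for the group means (0.5 and 3 are not taken from the data):

```r
# Arbitrary stand-ins for group means, plus the NaN from an all-NA group
group_means <- c(0.5, 3, NaN)

# Any comparison against NaN is NA, whatever is on the left-hand side
cmp <- group_means == NaN
print(cmp)

# ifelse() passes an NA test straight through, so neither branch is used
# and the result stays a logical NA
with_equals <- ifelse(cmp, NA, 1)
print(with_equals)

# is.nan() gives a proper TRUE/FALSE, so only the NaN entry becomes NA
with_is_nan <- ifelse(is.nan(group_means), NA, 1)
print(with_is_nan)
```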
I am not sure why you are checking for NaN, but if you want to do that, check it with is.nan instead of ==, since == against NaN always yields NA:
Data2 %>%
  group_by(label) %>%
  summarize(sevmean = ifelse(is.nan(mean(sev,na.rm=TRUE)),NA,
                      ifelse(mean(sev,na.rm=TRUE)<=1,1,
                      ifelse(mean(sev,na.rm=TRUE)>1,
                             mean(sev[sev>1],na.rm=TRUE),100))))
# label sevmean
# <dbl> <dbl>
#1 1.00 NA
#2 2.00 1.00
#3 3.00 1.00
#4 4.00 2.00
#5 5.00 3.67
#6 6.00 1.00
#7 7.00 4.50
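As a possible alternative, the same rules can be written as a small helper that computes the group mean once and uses plain if statements, since each group yields a single value. This is only a sketch: the name sev_summary is made up here, and base R's tapply is used so the example runs without any packages.

```r
label <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7)
sev <- c(NA,NA,NA,NA,1,0,1,1,1,NA,1,2,2,4,5,1,0,1,1,4,5)
Data2 <- data.frame(label, sev)

# Hypothetical helper (not from the posts above): returns one value per group
sev_summary <- function(x) {
  m <- mean(x, na.rm = TRUE)         # NaN when every value in x is NA
  if (is.nan(m)) return(NA_real_)    # typed NA keeps the result numeric
  if (m <= 1) return(1)              # mean of 1 or lower is reported as 1
  mean(x[x > 1], na.rm = TRUE)       # otherwise average only the values above 1
}

# Apply the helper to sev within each label
result <- tapply(Data2$sev, Data2$label, sev_summary)
print(result)
```

The 100 branch is left out because, once the NaN case is handled, the mean is always either at most 1 or greater than 1, so that branch can never be reached. The same helper should also work inside the dplyr pipeline as summarise(sevmean = sev_summary(sev)).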