I have a quick question about counting the non-missing entries of a column. Let's say I have data that looks like this:
data<-data.frame(id=c(1,1,1,1,2,2,2,3,3,3,3),var1=c(NA,2,5,3,NA,NA,6,4,4,NA,7))
How do I add a new column with the running count of non-missing var1 values for each id (as below)?
data<-data.frame(id=c(1,1,1,1,2,2,2,3,3,3,3),var1=c(NA,2,5,3,NA,NA,6,4,4,NA,7),count_nm=c(NA,1,2,3,NA,NA,1,1,2,NA,3))
The best I could do was to delete the rows where var1 is NA and then add the count for each id. But I would like to know how to do it without deleting those rows. Thanks!
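You can use cumsum on complete.cases. Note that rows where var1 is NA get the running count so far (0 if no value has been seen yet in that id) rather than NA: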
library(dplyr)
# .by requires dplyr >= 1.1.0; the native pipe |> requires R >= 4.1
data |>
  mutate(count_nm = cumsum(complete.cases(var1)), .by = id)
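If the NA rows should stay NA, as in the desired output from the question, one option is to compute the running count first and then blank out the missing rows. Below is a minimal base R sketch of that idea, not the only way to do it; the helper variable `running` is a name chosen here for illustration.

```r
# Example data from the question
data <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  var1 = c(NA, 2, 5, 3, NA, NA, 6, 4, 4, NA, 7)
)

# Running count of non-missing var1 values within each id
# (`running` is an illustrative helper, not part of the original answer)
running <- ave(as.integer(!is.na(data$var1)), data$id, FUN = cumsum)

# Rows where var1 itself is missing get NA instead of the carried-over count
running[is.na(data$var1)] <- NA
data$count_nm <- running

print(data)
```

The same two steps should translate directly to the dplyr version, for example by wrapping the cumsum in an if_else on is.na(var1).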
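I also like the convenient collapse::fcumsum function, which has an na.rm argument. With na.rm = TRUE and the default fill = FALSE, the NA rows should stay NA in the result, which matches the desired output. Note that var1 > 0 is only TRUE for positive values, so this counts non-missing values correctly only when every observed var1 is positive, as in the example.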
library(dplyr)
data |>
  mutate(count_nm = collapse::fcumsum(var1 > 0, na.rm = TRUE), .by = id)