I have a data set that looks like this:
library(data.table)
dt <- data.table(id = c("A", "A", "A", "B", "B", "B", "C", "C", "C"), Complete = c("Yes","No","Yes","Yes","No","Yes","Yes","Yes","Yes"))
> dt
   id Complete
1:  A      Yes
2:  A       No
3:  A      Yes
4:  B      Yes
5:  B       No
6:  B      Yes
7:  C      Yes
8:  C      Yes
9:  C      Yes
I would like to build a variable N_complete that captures the total count of Complete == "Yes"
by id. The final data should look like the following. What should I do to achieve this result?
I tried:
dt$N_complete <- unlist(lapply(split(dt,dt$ID), function(x) rep(summarize(n(x)[x$Complete=="Yes"],na.rm=T),nrow(x))))
Sorry for the mess. I am a beginner, and the error in my code might look very silly.
Since you are using data.table, you can easily compute the number of complete cases ('Yes' entries) by group using:
dt[, N_complete := sum(Complete == "Yes", na.rm = TRUE), by = .(id)]
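For reference, here is a minimal end-to-end sketch using the sample data from the question. The `counts` table is only an illustrative helper for checking the result and is not part of the original answer. One likely reason the attempt in the question fails is that the column is named `id`, while the code refers to `dt$ID`; column names in R are case-sensitive, so `dt$ID` is NULL.

```r
library(data.table)

dt <- data.table(
  id = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Complete = c("Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "Yes")
)

# Add the per-group count by reference; every row of a group gets the same value
dt[, N_complete := sum(Complete == "Yes", na.rm = TRUE), by = id]

# Illustrative helper: one row per id, to check the counts
counts <- dt[, .(N_complete = sum(Complete == "Yes", na.rm = TRUE)), by = id]
print(counts)
# A has 2, B has 2, C has 3
```

Because `:=` modifies `dt` in place, there is no need to assign the result back to `dt`. The comparison `Complete == "Yes"` yields a logical vector, and `sum()` counts its TRUE values within each group.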