Adding a column indicating current count of non-missing rows for the same ID

I have a quick question about counting non-missing entries of a column. Let's say I have the data that looks like:

data<-data.frame(id=c(1,1,1,1,2,2,2,3,3,3,3),var1=c(NA,2,5,3,NA,NA,6,4,4,NA,7))

How do I add a new column holding the running count of non-missing var1 values for each ID (as below)?

data<-data.frame(id=c(1,1,1,1,2,2,2,3,3,3,3),var1=c(NA,2,5,3,NA,NA,6,4,4,NA,7),count_nm=c(NA,1,2,3,NA,NA,1,1,2,NA,3))

The best I could do was to delete the rows where var1 is NA and then add the count for each ID. But I would like to know how to do it without deleting those rows. Thanks!

Solution

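You can use cumsum on the result of complete.cases, grouped by id. Note that rows where var1 is NA then carry the previous count forward (0 at the start of a group) instead of showing NA: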

library(dplyr)
data |> 
  mutate(count_nm = cumsum(complete.cases(var1)), .by = id)

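I also like the convenient collapse::fcumsum function, which has an na.rm argument. With na.rm = TRUE, missing values are skipped and left as NA in the result, which matches the desired count_nm. Note that var1 > 0 counts only positive values, so a zero or negative var1 would not increase the count.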

library(dplyr)
data |> 
 mutate(count_nm = collapse::fcumsum(var1 > 0, na.rm = TRUE), .by = id)
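If you would rather avoid extra packages, here is a minimal base R sketch of the same idea. It is not part of the answer above: it assumes the example data from the question, and the intermediate name running is just illustrative. It takes the per-id running count with ave and then sets the rows where var1 is missing back to NA, which reproduces the count_nm column asked for.

```r
# Example data from the question
data <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  var1 = c(NA, 2, 5, 3, NA, NA, 6, 4, 4, NA, 7)
)

# Running count of non-missing var1 within each id
# (`running` is a hypothetical helper name used only for this sketch)
running <- ave(as.integer(!is.na(data$var1)), data$id, FUN = cumsum)

# Keep NA in the rows where var1 itself is missing
data$count_nm <- ifelse(is.na(data$var1), NA_integer_, running)
```

Dropping the last step leaves the carried-forward counts (0, 1, 2, 3, ...), in case that is what you want instead.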
