I have a dataset and I'm trying to count the number of codes each patient has, as well as the number of codes of interest that each patient has.
Let's say that I have this table below and my code of interest is 26.
patient code
1 25
1 26
1 39
1 26
1 86
2 26
2 24
2 89
3 56
3 45
3 26
3 89
4 56
4 25
4 66
4 56
The result I'm looking for is:
Patient 1: 5 total codes, 2 codes of interest
Patient 2: 3 total codes, 1 code of interest
Patient 3: 4 total codes, 1 code of interest
Patient 4: 4 total codes, 0 codes of interest
How can I do this in R? Thank you!
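Here's a tidyverse approach. First, group_by(patient) so that the calculations that follow are done separately for each patient. Then use summarise() (summarize() is the same function) to count the rows, i.e. codes, for each patient with n(), and to count the occurrences of 26 for each patient with sum(code == 26). The comparison code == 26 gives TRUE or FALSE for every row, and sum() counts the TRUE values.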
library(tidyverse)

# df is the data frame with the columns patient and code
df %>%
  group_by(patient) %>%
  summarize(Total_codes = n(),
            Codes_of_interest = sum(code == 26))
# A tibble: 4 x 3
  patient Total_codes Codes_of_interest
    <int>       <int>             <int>
1       1           5                 2
2       2           3                 1
3       3           4                 1
4       4           4                 0
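If you want to check the counts without installing the tidyverse, here is a minimal base R sketch of the same idea. The data frame is typed in by hand from the table in the question, and the names code_of_interest and result are my own placeholders, not something from the question.

```r
# Example data typed in from the table in the question
df <- data.frame(
  patient = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4),
  code    = c(25, 26, 39, 26, 86, 26, 24, 89, 56, 45, 26, 89, 56, 25, 66, 56)
)

# Hypothetical variable name; change the value to count a different code
code_of_interest <- 26

# tapply() applies a function to the codes of each patient,
# returning one value per patient in sorted patient order
result <- data.frame(
  patient           = sort(unique(df$patient)),
  Total_codes       = as.vector(tapply(df$code, df$patient, length)),
  Codes_of_interest = as.vector(tapply(df$code == code_of_interest, df$patient, sum))
)

print(result)
```

Keeping the code of interest in a variable means you only have to change one line to count a different code. The dplyr version is easier to read once you have more summary columns, so this is mostly useful as a quick check.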