r dataframe data-analysis

How do I add up all of the values of a variable for a person in R?


I am new to R and am trying to look at the total exposure a patient has had in their lifetime. Here is how the data is laid out currently:

  patient_id  exposure
1 p01         1.4        
2 p01         3.2         
3 p02         0          
4 p02         6.4       
5 p02         8.1        
6 p03         2.8

Here is how I would like the output to be:

  patient_id   total_exposure
1 p01          4.6        
2 p02          14.5                 
3 p03          2.8

Here is what I have tried:

 total_exposure <- yearly_exposure %>%
  group_by(patient_id) %>%
  mutate(exposure = cumsum(exposure)) %>%
  top_n(-1)

For some reason, it is not combining the multiple occurrences of each patient into one line, and it is only showing the first exposure value in the output.

Thank you for the help!


Solution

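  • mutate() keeps every row, so cumsum() gives a running total on each row rather than one row per patient. top_n(-1) then keeps the row with the smallest running total, which for non-negative exposures is the first one. summarize() collapses each group to a single row, so this is probably a cleaner way to accomplish what you're looking for: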

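    library(dplyr)

    # Start from the raw data (yearly_exposure) and collapse each patient to one row
    total_exposure <- yearly_exposure %>%
      group_by(patient_id) %>%
      summarize(total_exposure = sum(exposure))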
    

    Another approach, closer to yours (using mutate), is to use sum instead of cumsum and then drop the duplicated rows:

    # mutate() keeps every row, so each patient's total is repeated; distinct() removes the repeats
    yearly_exposure %>%
      group_by(patient_id) %>%
      mutate(exposure = sum(exposure)) %>%
      distinct()
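
    To try this end to end, here is a minimal sketch that rebuilds the sample rows from the question by hand. It assumes the input data frame is called yearly_exposure, as in the question's code. The aggregate() version is a base R alternative that the answer above does not mention, and total_base is only an illustrative name.

```r
library(dplyr)

# Sample data copied from the question (assumed to live in yearly_exposure)
yearly_exposure <- data.frame(
  patient_id = c("p01", "p01", "p02", "p02", "p02", "p03"),
  exposure   = c(1.4, 3.2, 0, 6.4, 8.1, 2.8)
)

# dplyr: one row per patient, holding the sum of that patient's exposures
total_exposure <- yearly_exposure %>%
  group_by(patient_id) %>%
  summarise(total_exposure = sum(exposure))

# Base R alternative (illustrative): same totals without loading any packages
total_base <- aggregate(exposure ~ patient_id, data = yearly_exposure, FUN = sum)
names(total_base)[2] <- "total_exposure"

print(total_exposure)
print(total_base)
```

    Both results should have one row per patient_id, with totals matching the desired output in the question.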