I have a data set like this:
PatientNum<- c(1, 2, 3, 4, 5,6,7)
Age<- c(10, 21, 51, 42, 35,99,25)
SurgicalProcedureNumber <- c(21356, 21424, 221356, NA, 12345, 54321,NA)
OtherSurgicalProcedureNumber <- c(54321, NA, 21356, 12345, NA,12345, 21424)
BloodLoss<-c(5,4,5,10,5,15,9)
YetAnotherSurgicalProcedureNumber<-c(11111,22222,NA, 33333,21356,555555,NA)
dataset <- data.frame(PatientNum, Age, SurgicalProcedureNumber,
OtherSurgicalProcedureNumber, BloodLoss,YetAnotherSurgicalProcedureNumber)
There were certain surgical procedures I cared about that I wanted to filter for:
NumbersIcareAbout<- c(21356,21424)
I filtered the data over multiple columns to detect any time one of the numbers I cared about popped up. I saved that result as a new dataset of patients with those procedures done. Did so like this:
NewData<-dataset %>%
filter(SurgicalProcedureNumber %in% NumbersIcareAbout |
OtherSurgicalProcedureNumber %in% NumbersIcareAbout |
YetAnotherSurgicalProcedureNumber %in% NumbersIcareAbout)
But now I want to find out in the new data, what OTHER surgical procedure numbers (which were in multiple columns) showed up and how many of them. I.e. its not a blank NA, and its not just one of the surgical procedures I originally filtered for.
For instance the result I'm looking for would show that in these select patients who had 21356 and/or 21424 done to them, they also had ##### and ##### procedures done a certain number of times, which might be spread out over multiple columns.
Any help would be appreciated, thank you.
How about this?
library(tidyr)
NewData %>%
gather(key="typeofProcedure",
value = "procedureNr",
contains("Procedure")) %>%
filter(!is.na(procedureNr),
!procedureNr %in% NumbersIcareAbout) %>%
group_by(PatientNum,procedureNr) %>%
summarise(n = n())
hope it helps!!