I have a data frame like this one:
ID matching_variable status
1 1 case
2 1 control
3 2 case
4 2 case
5 3 control
6 3 control
7 4 case
8 4 control
9 5 case
10 6 control
I would like to keep all the "pairs" of matched subjects (subjects that share the same matching_variable) that consist of 1 case and 1 control, such as the pairs with matching_variable = 1 or matching_variable = 4.
So I would like to remove the matched subjects for which there are only cases (such as matching_variable = 2) or only controls (such as matching_variable = 3), as well as the subjects that are alone because they have not been matched (such as the last 2 subjects).
The expected result would be this:
ID matching_variable status
1 1 case
2 1 control
7 4 case
8 4 control
I'm sure it's not too complicated but I have no idea how to go about it...
Thanks in advance for the help
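An idea via base R: keep the rows whose matching_variable group contains more than one distinct status (as.character() guards against status being stored as a factor),
df[as.logical(with(df, ave(as.character(status), matching_variable, FUN = function(i) length(unique(i)) > 1))), ]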
  ID matching_variable  status
1  1                 1    case
2  2                 1 control
7  7                 4    case
8  8                 4 control
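Note that the one-liner above keeps any group containing both statuses, so a group with, say, two cases and one control would also be kept. If you need strictly one case and one control per group, here is a minimal sketch of a stricter variant, still in base R. It rebuilds the example data from the question; the helper names n_case, n_control and result are only illustrative, and it assumes status takes exactly the values "case" and "control".

```r
# Rebuild the example data from the question
df <- data.frame(
  ID = 1:10,
  matching_variable = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6),
  status = c("case", "control", "case", "case", "control",
             "control", "case", "control", "case", "control"),
  stringsAsFactors = FALSE
)

# Count cases and controls within each matching group
# (as.numeric() turns the TRUE/FALSE comparison into 1/0 before summing)
n_case    <- ave(as.numeric(df$status == "case"),    df$matching_variable, FUN = sum)
n_control <- ave(as.numeric(df$status == "control"), df$matching_variable, FUN = sum)

# Keep only groups made of exactly one case and one control
result <- df[n_case == 1 & n_control == 1, ]
print(result)
```

On the example data this keeps the same four rows (ID 1, 2, 7 and 8) as the one-liner; the two approaches only differ when a group has more than two members.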