Tags: r, dataframe, subset

Subsetting a data set based on a logical test


I would like to subset a dataset based on a logical test.

My data look like:

A  B
1  2
3  4
5  7
2  1

I would like to split the dataset into two sub-datasets. The first should contain every row whose reverse combination also appears in the data (for example, the row 1 2 has the reverse row 2 1), and the second should contain all remaining rows. So the desired output would look like:

data1
A B
1 2
2 1

And the second:

data2
A B
3 4
5 7

I know that the subset() function allows logical tests, but I don't know how to set this one up in R.


Solution

  • You can sort the values within each row so that a row and its reverse become identical, find the indices of the rows that are then duplicated, and subset based on that:

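    # Example data from the question
    df <- data.frame(A = c(1, 3, 5, 2), B = c(2, 4, 7, 1))

    # Sort the values within each row so that a row and its reverse match
    MySortedData <- data.frame(t(apply(df, 1, sort)))
    #   X1 X2
    # 1  1  2
    # 2  3  4
    # 3  5  7
    # 4  1  2

    # Flag every row that occurs more than once after sorting
    MyDuplicates <- duplicated(MySortedData) | duplicated(MySortedData, fromLast = TRUE)
    # [1]  TRUE FALSE FALSE  TRUE

    # Rows without a reversed partner
    MySubset2 <- df[!MyDuplicates, ]
    #   A B
    # 2 3 4
    # 3 5 7

    # Rows whose reverse combination exists
    MySubset1 <- df[MyDuplicates, ]
    #   A B
    # 1 1 2
    # 4 2 1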
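
For a data frame with exactly two numeric columns, the same idea can probably be written without apply(). The sketch below is an alternative to the answer above, not a restatement of it: it builds an order-independent key for each row with pmin() and pmax(). It assumes the data frame is called df with columns A and B, as in the question. The names key and has_reverse are made up for illustration.

```r
# Example data from the question (assumed to be stored in `df`).
df <- data.frame(A = c(1, 3, 5, 2), B = c(2, 4, 7, 1))

# Order-independent key per row: smaller value first, larger value second,
# so the rows (1, 2) and (2, 1) both get the key "1 2".
key <- paste(pmin(df$A, df$B), pmax(df$A, df$B))

# A key that occurs more than once means the row has a matching partner.
has_reverse <- duplicated(key) | duplicated(key, fromLast = TRUE)

data1 <- df[has_reverse, ]   # rows whose reverse combination exists
data2 <- df[!has_reverse, ]  # all remaining rows
```

Since the question mentions subset(), note that subset(df, has_reverse) and subset(df, !has_reverse) should give the same two data frames. One caveat applies to both this sketch and the sorted-row approach: two exactly identical rows (for example 1 2 appearing twice) are also flagged, because after sorting they cannot be told apart from a reversed pair.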