Tags: r, dataframe, unique

Keep only series with non-unique values in data frame


I have data with the following pattern:

uid  result
A    FALSE
A    TRUE
A    FALSE
B    FALSE
B    FALSE
B    FALSE
C    TRUE
C    TRUE
C    TRUE

I would like to filter out the uid groups for which the value of result is always the same. In this example, rows with uid = B and uid = C will be filtered out, since their result columns contain only FALSE and only TRUE, respectively.

By contrast, rows with uid = A will be kept, since they contain both TRUE and FALSE values.


Solution

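    # `.by` requires dplyr >= 1.1.0; the native pipe `|>` requires R >= 4.1
    library(dplyr)
    # Keep only the uid groups that contain more than one distinct result value
    your_data |> filter(n_distinct(result) > 1, .by = uid)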
    #   uid result
    # 1   A  FALSE
    # 2   A   TRUE
    # 3   A  FALSE
    

    Using this sample data:

    your_data <- read.table(text = 'uid  result
    A    FALSE
    A    TRUE
    A    FALSE
    B    FALSE
    B    FALSE
    B    FALSE
    C    TRUE
    C    TRUE
    C    TRUE', header = TRUE)
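
If dplyr is not available, the same filter can be sketched in base R. This is not part of the original answer, only one possible alternative: it assumes `ave()` is used to count the distinct `result` values within each `uid` and repeat that count on every row of the group. The names `n_vals` and `kept` are made up for illustration.

```r
# Same sample data as above, built directly as a data frame
your_data <- data.frame(
  uid    = rep(c("A", "B", "C"), each = 3),
  result = c(FALSE, TRUE, FALSE,
             FALSE, FALSE, FALSE,
             TRUE, TRUE, TRUE)
)

# For each uid, count the distinct result values; ave() recycles the
# per-group count so that n_vals has one entry per row of your_data
n_vals <- ave(as.integer(your_data$result), your_data$uid,
              FUN = function(x) length(unique(x)))

# Keep only rows whose group has more than one distinct value
kept <- your_data[n_vals > 1, ]
kept
```

With the sample data, only the three rows with uid = A should remain, matching the dplyr result.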