How to filter a dataset to have unique values in a column without deleting the already unique values in the targeted column?


I have this data frame that I need to filter as follows:

library(dplyr)
library(tidyverse)

df = data.frame(x = rep(1,20), y = c(rep(2,5),rep(3,5),11:20) )

df = df%>%group_by(x)%>%filter(y==unique(y))

However, the output is:

> df = df%>%group_by(x)%>%filter(y==unique(y))
Warning message:
In y == unique(y) :
  longer object length is not a multiple of shorter object length
> df
# A tibble: 1 × 2
# Groups:   x [1]
      x     y
  <dbl> <dbl>
1     1     2

The result I am looking for is as follows:

   x  y
1  1  2
2  1  3
3  1 11
4  1 12
5  1 13
6  1 14
7  1 15
8  1 16
9  1 17
10 1 18
11 1 19
12 1 20

Thanks


Solution

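  • I think you're looking for distinct(). Your filter() fails because y == unique(y) compares the 20 values of y element by element against the 12 unique values, recycling the shorter vector (hence the warning), so only the first row happens to match.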

    library(dplyr)
    
    df = data.frame(x = rep(1,20), y = c(rep(2,5),rep(3,5),11:20) )
    
    df %>% group_by(x) %>% distinct(y)
    #> # A tibble: 12 × 2
    #> # Groups:   x [1]
    #>        x     y
    #>    <dbl> <dbl>
    #>  1     1     2
    #>  2     1     3
    #>  3     1    11
    #>  4     1    12
    #>  5     1    13
    #>  6     1    14
    #>  7     1    15
    #>  8     1    16
    #>  9     1    17
    #> 10     1    18
    #> 11     1    19
    #> 12     1    20
    

    Created on 2023-02-20 with reprex v2.0.2
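
    If you would rather keep the filter() style from the question, a minimal sketch along these lines should give the same 12 rows. It is not part of the answer above: it assumes base R's duplicated() is an acceptable way to flag repeats, and the names res_filter and res_distinct are only illustrative.

```r
library(dplyr)

df <- data.frame(x = rep(1, 20), y = c(rep(2, 5), rep(3, 5), 11:20))

# duplicated(y) is TRUE for every repeat after the first occurrence,
# so negating it keeps the first row for each distinct y within each x group.
res_filter <- df %>%
  group_by(x) %>%
  filter(!duplicated(y)) %>%
  ungroup()

# Without grouping: keep one row per distinct (x, y) combination.
res_distinct <- df %>%
  distinct(x, y)
```

    Both forms keep values of y that were already unique (11 to 20) and collapse the repeated 2s and 3s to one row each. Note that filter(!duplicated(y)) keeps any other columns from the first matching row, while distinct(x, y) returns only x and y unless you add .keep_all = TRUE.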