r contingency

Getting values that appear exactly n times


I started thinking about this problem while trying to get the values from a vector that are not repeated. unique does not help (as far as I can tell from the documentation) because it still returns the repeated elements, just once each. duplicated has a similar problem, since it returns FALSE the first time it encounters a value that is duplicated. This was my workaround:

> d=c(1,2,4,3,4,6,7,8,5,10,3)
> setdiff(d,unique(d[duplicated(d)]))
[1]  1  2  6  7  8  5 10

The following is a more general approach:

> table(d)->g
> as.numeric(names(g[g==1]))
[1]  1  2  5  6  7  8 10

which can be generalized to values other than 1. But I find this solution a bit clumsy, since it converts strings back to numbers. Is there a better or more straightforward way to get this vector?
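For reference, the table-based generalization described above might be sketched as follows. The function name values_n_times_table and its n argument are made up for illustration; the string-to-number conversion is still there, which is exactly the clumsy part.

```r
# Hypothetical helper (name is illustrative only): values of x that occur
# exactly n times, using table().
values_n_times_table <- function(x, n = 1) {
  counts <- table(x)
  # names() of a table are character strings, so they must be converted
  # back to numbers; this only makes sense for numeric input.
  as.numeric(names(counts)[counts == n])
}

d <- c(1, 2, 4, 3, 4, 6, 7, 8, 5, 10, 3)
values_n_times_table(d, 1)
## [1]  1  2  5  6  7  8 10
values_n_times_table(d, 2)
## [1] 3 4
```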


Solution

  • You could sort the values, then use rle to get the values that appear n times consecutively.

    rl <- rle(sort(d))
    
    rl$values[rl$lengths==1]
    ## [1]  1  2  5  6  7  8 10
    
    rl$values[rl$lengths==2]
    ## [1] 3 4
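
The rle idea above could be wrapped in a small function for arbitrary n. This is only a sketch: the names appearing_n_times and appearing_n_times_ordered are hypothetical, not part of base R. The second variant is an alternative that swaps rle for ave, assuming you would rather keep the order of first appearance than get sorted output.

```r
# Hypothetical helper: values of x that occur exactly n times, in sorted
# order. Note that sort() drops NA values by default.
appearing_n_times <- function(x, n = 1) {
  rl <- rle(sort(x))
  rl$values[rl$lengths == n]
}

# Alternative sketch: same result, but in order of first appearance.
appearing_n_times_ordered <- function(x, n = 1) {
  # For each element, count how many times its value occurs in x
  counts <- ave(seq_along(x), x, FUN = length)
  unique(x[counts == n])
}

d <- c(1, 2, 4, 3, 4, 6, 7, 8, 5, 10, 3)
appearing_n_times(d, 1)
## [1]  1  2  5  6  7  8 10
appearing_n_times(d, 2)
## [1] 3 4
appearing_n_times_ordered(d, 1)
## [1]  1  2  6  7  8  5 10
appearing_n_times_ordered(d, 2)
## [1] 4 3
```

Neither version goes through character names, so the values keep their original type.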