r, distribution, sampling

Sampling the 10% of values around the peak of a distribution in R


I have a vector of 100 values.

bingo<-sample(0:5000, 100)

plot(density(bingo))

which.max(density(bingo)$y) # [1] 194

density(bingo)$x[194] # [1] 1507.085

I want to sample the 10% of the values around 1507.085, the peak of my distribution. How can I achieve this?

Many thanks for any advice!


Solution

  • It depends on what you mean by "around". Here are a couple of options:

    set.seed(94)
    bingo <- sample(0:5000, 100)
    bingo <- sort(bingo)
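    dens <- density(bingo)  # estimate the density once and reuse it
    bingo_mode <- dens$x[which.max(dens$y)]  # x position of the density peak
    # index of the first sorted value above the peak
    idxMode <- match(TRUE, bingo > bingo_mode)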
    
    # to sample from the 5 values below and the 5 values above the peak (or the
    # bottom/top decile if the peak falls too close to either end of the data)
    if (idxMode <= 5) {  # <= so that idxFrom never becomes 0, which R silently drops
      idxFrom <- 1
      idxTo <- 10
    } else if (idxMode + 4 > 100) {
      idxFrom <- 91
      idxTo <- 100
    } else {
      idxFrom <- idxMode - 5
      idxTo <- idxMode + 4
    }
    
    sample(bingo[idxFrom:idxTo], 1)
    #> [1] 2557
    
    # to sample from the 10 values nearest the peak, on either side
    if (idxMode <= 10) {
      idxFrom <- 1
      idxTo <- 20
    } else if (idxMode + 9 > 100) {
      idxFrom <- 81
      idxTo <- 100
    } else {
      idxFrom <- idxMode - 10
      idxTo <- idxMode + 9
    }
    
    sample(bingo[idxFrom:idxTo][order(abs(bingo_mode - bingo[idxFrom:idxTo]))[1:10]], 1)
    #> [1] 2678
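
If "around" means "nearest by distance", the index bookkeeping above can be avoided by ranking every value by its distance from the peak. The following is only a sketch under that assumption. The helper name `nearest_to_peak` and its `prop` argument are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (an assumption, not from the original answer):
# return the `prop` share of values in `x` closest to the density peak.
nearest_to_peak <- function(x, prop = 0.10) {
  d <- density(x)                        # kernel density estimate, computed once
  peak <- d$x[which.max(d$y)]            # x position of the highest density
  k <- max(1, round(prop * length(x)))   # how many values to keep (at least 1)
  x[order(abs(x - peak))[seq_len(k)]]    # the k values nearest the peak
}

set.seed(94)
bingo <- sample(0:5000, 100)

candidates <- nearest_to_peak(bingo)     # the 10 values nearest the peak

# Draw one candidate by index; sample(candidates, 1) would misbehave if
# `candidates` ever had length 1, because sample() then draws from 1:candidates.
picked <- candidates[sample.int(length(candidates), 1)]
```

Because the values are ranked by distance directly, `bingo` does not need to be sorted first, and no special cases are needed when the peak sits near either end of the data.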