Generate numbers in R


In R, how can I generate N numbers that have a mean of X and a median of Y (or at least close to those values)?

Or perhaps more generally, is there an algorithm for this?


Solution

  • There is an infinite number of solutions.

    Approximate algorithm:

    1. Generate n/2 numbers below the median
    2. Generate n/2 numbers above the median
    3. Add your desired median and check
    4. Add one number with enough weight to satisfy your mean, which you can solve for

    Example assuming you want a median of zero and a mean of twenty:

    R> set.seed(42)
    R> lo <- rnorm(10, -10); hi <- rnorm(10, 10)
    R> median(c(lo,0,hi))
    [1] 0                         # this meets our first criterion
    R> 22*20 - sum(c(lo,0,hi))    # (n+1)*desiredMean - currentSum
    [1] 436.162                   # so if we insert this, we get the right answer
    R> mean(c(lo,0,hi,22*20 - sum(c(lo,0,hi))))
    [1] 20                        # so we meet criterion two

    This works because desiredMean * (n+1) has to equal sum(currentSet) + x, where n is the number of values currently in the set, so we solve for x and get the expression above.
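The steps above can be wrapped in a small function. This is only a sketch, and it differs from the answer in one respect: appending a value to an odd-length set makes its length even, so median() then averages the two middle values and moves away from the target. To keep the median exact, the sketch shifts the largest (or smallest) existing value instead of appending a new one. The name gen_mean_median, the spread argument, and the use of rexp() for the offsets are assumptions made for illustration; n is assumed to be odd.

```r
# Hypothetical helper (not part of the original answer): n values with a
# given mean and an exact median. Assumes n is odd and at least 3.
gen_mean_median <- function(n, target_mean, target_median, spread = 1) {
  stopifnot(n %% 2 == 1, n >= 3)
  k <- (n - 1) / 2

  # k values below the median and k values above it; rexp() offsets are
  # non-negative, so neither half crosses the median
  lo <- target_median - rexp(k, rate = 1 / spread)
  hi <- target_median + rexp(k, rate = 1 / spread)
  x <- c(lo, target_median, hi)

  # The sum must equal n * target_mean; d is the amount still missing
  d <- n * target_mean - sum(x)

  # Move an extreme value outward so the middle element is unaffected
  i <- if (d >= 0) which.max(x) else which.min(x)
  x[i] <- x[i] + d
  x
}

set.seed(42)
x <- gen_mean_median(21, target_mean = 20, target_median = 0)
median(x)   # the target median, exactly
mean(x)     # the target mean, up to floating-point rounding
```

As in the answer, a single value carries all the weight needed to reach the mean, so the result contains one large outlier whenever the target mean and median are far apart.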