R: How to generate sample numbers by splitting a given total into a specific number of parts with random ratios


The goal is to create a random dataset by distributing a given TOTAL across a specific number of parts, using random ratios.

e.g.

Given number
x = 100

Number of parts to divide it into
n = 5

Random ratios (5 values, one per part, summing to 1)
ratio = c(0.1, 0.04, 0.01, 0.3, 0.55) # assume there is a function that generates these randomly

These correspond, respectively, to
100 * 0.1 = 10,
100 * 0.04 = 4,
100 * 0.01 = 1,
100 * 0.3 = 30,
100 * 0.55 = 55

so the parts add back up to the given number, 100.
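
One way the assumed ratio generator could look is sketched below. The name random_ratio is hypothetical (it is not from the original post), and normalising uniform draws is only one of several reasonable choices. Note that the resulting parts are generally not whole numbers.

```r
# Hypothetical helper (an assumption, not from the original post):
# n random ratios that sum to 1.
random_ratio = function(n) {
  r = runif(n)   # n draws from Uniform(0, 1)
  r / sum(r)     # normalise so the ratios sum to 1
}

set.seed(42)           # only to make the sketch reproducible
x = 100
n = 5
ratio = random_ratio(n)
parts = x * ratio      # generally not whole numbers
sum(parts)             # equals x, up to floating-point error
```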

I know the individual pieces of a solution, but I don't know how to combine them in an efficient way:

# generate random numbers within a range, e.g. shuffle the integers 1 to 100
sample(1:100)
# generate a sequence with one fixed step (ratio); this does NOT divide a total into a specific number of random parts
seq(from=0, to=1, by=0.1)
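
As a rough sketch of how sample() alone could be combined into a solution: choosing n - 1 random cut points between 1 and x - 1 splits x into n whole-number parts, and the ratios then follow from the parts. The name random_parts is hypothetical, and this assumes every part should be at least 1.

```r
# Hypothetical helper (an assumption, not from the original post):
# split x into n whole-number parts, each at least 1, using n - 1 random cut points.
random_parts = function(x, n) {
  stopifnot(n >= 1, x >= n)
  cuts = sort(sample(seq_len(x - 1), n - 1))  # distinct cut points in 1..(x - 1)
  diff(c(0, cuts, x))                         # gaps between cut points are the parts
}

set.seed(1)            # only to make the sketch reproducible
x = 100
n = 5
parts = random_parts(x, n)
ratio = parts / x      # the implied random ratios
sum(parts)             # always x
sum(ratio)             # 1, up to floating-point error
```

Here the ratios are a by-product of the split rather than an input, so the parts are guaranteed to be integers that sum to x.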

Solution

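  • This is an example function that splits the integers 1 to x into groups whose sizes follow the desired ratios. We can confirm that it worked properly by combining the groups with reduce (from purrr) and counting the distinct values with n_distinct (from dplyr): the count should equal x.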

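    # Split the integers 1..x into groups whose sizes follow `ratio`.
    split_sample=function(x, ratio) {
      my_list=list()
      y=1:x  # pool of values not yet assigned to a group
      for(i in seq_along(ratio)) {
        # draw positions from the remaining pool; floor() keeps group sizes whole
        my_sample=sample(seq_along(y), floor(ratio[i]*x))
        my_list=c(my_list, list(y[my_sample]))
        # guard: y[-integer(0)] would empty the pool when a group has size 0
        if(length(my_sample) > 0) y=y[-my_sample]
      }
      return(my_list)
    }
    x=100
    n=5
    ratio = c(0.1,0.04,0.01,0.3,0.55)
    my_list=split_sample(x, ratio)
    my_list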
    
    [[1]]
     [1] 44 54 49 67 21 29 50 68 47 52
    
    [[2]]
    [1] 100  70  30  94
    
    [[3]]
    [1] 99
    
    [[4]]
     [1] 25 31  3 32 71 83 78 11 36 23 86 93  9 37 74 81  8 95 39 45 92  4 10 48 82 64 63 79 72 96
    
    [[5]]
     [1]  2 88 58 38 34 61 51 26 57 40 59 75 17 87 41 73 66 55 24  7 56 19 27 12 28  6 98 80 60 89  5 46
    [33] 91 90 13 76 77 33 14 85 43 16 84 65 53 35 15 42 20  1 69 62 22 18 97
    
    # n_distinct() comes from dplyr and reduce() from purrr
    library(dplyr)
    library(purrr)
    n_distinct(reduce(my_list, union))
    [1] 100
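
With ratios that do not multiply out to whole numbers, floor() alone can leave the group sizes short of x (three ratios of 1/3 with x = 100 give 33 + 33 + 33 = 99). A minimal sketch of one possible fix, assuming largest-remainder rounding is acceptable, is shown below. The name split_total is hypothetical and is not part of the answer above.

```r
# Hypothetical helper (an assumption, not from the original answer):
# turn ratios into whole-number parts that sum exactly to `total`.
split_total = function(total, ratio) {
  raw = total * ratio / sum(ratio)       # ideal, possibly fractional, parts
  parts = floor(raw)                     # round every part down first
  leftover = round(total - sum(parts))   # units still to hand out
  if (leftover > 0) {
    # give one extra unit to the parts with the largest fractional remainders
    idx = order(raw - parts, decreasing = TRUE)[seq_len(leftover)]
    parts[idx] = parts[idx] + 1
  }
  parts
}

set.seed(123)          # only to make the sketch reproducible
x = 100
r = runif(5)
ratio = r / sum(r)     # random ratios summing to 1
parts = split_total(x, ratio)
parts
sum(parts)             # x
```

The resulting sizes could then be used in place of floor(ratio[i]*x) when drawing each group.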