I have a question similar to this one:
Weighted sampling with 2 vectors
I have a dataset with 1000 observations and 4 columns. I want to sample 200 observations from it with replacement.
The problem is that I need a different probability vector for each column. For example, for the first column I want equal probabilities, c(0.001, 0.001, 0.001, 0.001, ...). For the second column I want something different, like c(0.0005, 0.0002, ...). Of course, each probability vector sums to 1.
I know sample() can do this with a single probability vector, but I am not sure how to handle one vector per column. Please help!
Thank you in advance! Colamonkey
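# Your data has 1000 rows and 4 columns; this small example
# (4 rows, 3 columns) just shows the procedure.
# One probability vector per column of df; each column sums to 1.
samp_prob <- data.frame(A = rep(.25, 4), B = c(.5, .1, .2, .2), C = c(.3, .6, .05, .05))
df <- data.frame(a = 1:4, b = 2:5, c = 3:6)
# Pair each data column with its probability column and draw 200 values with replacement.
sam <- mapply(function(x, y) sample(x, size = 200, replace = TRUE, prob = y), df, samp_prob)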
head(sam)
     a b c
[1,] 4 5 6
[2,] 1 2 4
[3,] 1 2 4
[4,] 4 4 4
[5,] 4 4 4
[6,] 1 2 4
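# Equivalently, pass sample directly and name its arguments;
# constants that are the same for every column go in MoreArgs:
mapply(sample, x = df, prob = samp_prob, MoreArgs = list(size = 200, replace = TRUE))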
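The same procedure scaled up to the size in the question might look like the sketch below. The data, the column names (v1 to v4) and the raw weights are all made up for illustration; only the 1000 x 4 shape and the sample size of 200 come from the question. Note that each column is sampled independently, so a row of the result is generally not a row of the original data.

```r
# Hypothetical data: 1000 observations, 4 columns (names are made up).
set.seed(42)  # only so the sketch is reproducible
n <- 1000
dat <- data.frame(v1 = rnorm(n), v2 = rnorm(n), v3 = rnorm(n), v4 = rnorm(n))

# Hypothetical raw weights, one column per data column.
# v1 uses equal weights, which normalise to 0.001 each.
raw_w <- data.frame(v1 = rep(1, n), v2 = runif(n),
                    v3 = seq_len(n), v4 = rev(seq_len(n)))

# Normalise so every probability vector sums to 1, then check it.
probs <- as.data.frame(lapply(raw_w, function(w) w / sum(w)))
stopifnot(all(abs(colSums(probs) - 1) < 1e-8))

# Draw 200 values per column, with replacement, each column with its own weights.
sampled <- mapply(sample, x = dat, prob = probs,
                  MoreArgs = list(size = 200, replace = TRUE))
dim(sampled)
```

The result is a 200 x 4 matrix whose column names are taken from dat. Normalising the weights before sampling is a design choice that makes the sum-to-1 requirement explicit and easy to check.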