Tags: r, select, random, statistics-bootstrap

selecting two random numbers via bootstrapping


I have a dataset of 1020 size measurements. I need to create a new dataset from these 1020 numbers by sampling with replacement. However, the sampling has to work as follows:

  1. Draw two numbers at random from the original dataset.
  2. Select the larger of these two numbers.
  3. Put this larger number into the new dataset.
  4. Repeat steps 1-3 until the new dataset has 1020 sizes (as in the original dataset), and do this until there are 10000 new datasets of 1020 sizes each.

I can already create 10000 new datasets from the original one by sampling with replacement (bootstrapping):

# one bootstrap resample of `size` per column
a <- matrix(NA_real_, nrow = length(size), ncol = 10000)
for (i in 1:10000) a[, i] <- sample(size, replace = TRUE)

But I do not know how to adapt this command to draw two random numbers, select the larger one, and put that larger one into the new dataset.

Could it be something like the following?

b <- numeric(10000)
for(i in 1:10000) b[i] <- sample(size, 2, ......, replace = T))

And then have some command (which I do not know) where the dots are, to get the larger of the two numbers into the new datasets?


Solution

  • I think this might do what you want. y1 will contain the first draw of every pair and y2 will contain the second. The pmax function takes the larger value of each pair, element by element, and the matrix command puts the results into a matrix with 1020 rows and 10000 columns, so each column is one new dataset. Here data stands for your vector of measurements. You might want to replace some of these 'magic' numbers with variables in your script so that you can easily try small samples for testing purposes.

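    # first and second draw of every pair (1020 * 10000 pairs in total)
    y1 <- sample(data, 1020 * 10000, replace = TRUE)
    y2 <- sample(data, 1020 * 10000, replace = TRUE)
    
    # keep the larger of each pair; matrix() fills column by column
    bigDat <- matrix(pmax(y1, y2), nrow = 1020)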
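
The suggestion to replace the 'magic' numbers with variables could look roughly like the sketch below. The names n_obs and n_sets are made up for illustration, and the size vector is simulated only so the snippet runs on its own; substitute your real measurements. n_sets is kept small here for quick testing and would be 10000 for the full run.

```r
# Hypothetical stand-in for the real measurements (assumption, not real data).
set.seed(1)
size <- runif(1020, min = 1, max = 100)

n_obs  <- length(size)  # sizes per new dataset (1020 in the question)
n_sets <- 10            # number of new datasets; use 10000 for the full run

# Draw the first and second member of every pair, with replacement.
y1 <- sample(size, n_obs * n_sets, replace = TRUE)
y2 <- sample(size, n_obs * n_sets, replace = TRUE)

# Keep the larger of each pair. matrix() fills column by column,
# so each column of bigDat is one new dataset of n_obs sizes.
bigDat <- matrix(pmax(y1, y2), nrow = n_obs, ncol = n_sets)

dim(bigDat)
```

A single new dataset is then a column, for example bigDat[, 1], and per-dataset summaries can be computed with something like colMeans(bigDat).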