Sample using start and end values within a loop in R


I am trying to sample between a range of values as part of a larger loop in R. As the loop progresses to each row j, I want to sample a number between the value given in the start column and the value given in the end column, placing that value in the sampled column for that row.

The results should look something like this:

ID  start  end  sampled
a   25     67   44
b   36     97   67
c   23     85   77
d   15     67   52
e   21     52   41
f   43     72   66
g   39     55   49
h   27     62   35
i   11     99   17
j   21     89   66
k   28     65   48
l   44     58   48
m   16     77   22
n   25     88   65

I started with mapply, but that samples every row of df at once, so I end up trying to fit all of the sampled values into a single cell.

df[j,4] <- mapply(function(x, y) sample(seq(x, y), 1), df$start, df$end)
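A minimal sketch of why that assignment does not fit, assuming the sample data from this question rebuilt as a plain data.frame (the `draws` variable and the `set.seed` call are only for illustration). Because `mapply` walks `df$start` and `df$end` in parallel, it returns one draw per row, which suits a whole column rather than the single cell `df[j,4]`:

```r
# Sample data from the question, rebuilt as a plain data.frame
df <- data.frame(
  ID      = letters[1:14],
  start   = c(25, 36, 23, 15, 21, 43, 39, 27, 11, 21, 28, 44, 16, 25),
  end     = c(67, 97, 85, 67, 52, 72, 55, 62, 99, 89, 65, 58, 77, 88),
  sampled = NA_real_
)

set.seed(1)  # only so the run is repeatable

# mapply() pairs up start and end element by element: one draw per row
draws <- mapply(function(x, y) sample(seq(x, y), 1), df$start, df$end)
length(draws)  # 14, one value per row of df

# A vector of that length fills a whole column, not a single cell
df$sampled <- draws
```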

I thought maybe something using seq might work, but this results in errors saying that from must be of length 1.

df[j,4] <- sample(seq(df$start, df$end),1,replace=TRUE)
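The error can be reproduced in isolation. This is a small sketch, assuming three rows of the sample data held in standalone vectors (the names `res`, `j`, and `one_draw` are illustrative). `seq` builds a single sequence, so it rejects a `from` that holds a whole column, while indexing one row first gives it the scalars it expects:

```r
start <- c(25, 36, 23)
end   <- c(67, 97, 85)

# Passing whole columns fails; capture the message instead of stopping
res <- tryCatch(seq(start, end), error = function(e) conditionMessage(e))
print(res)

# Picking out row j first hands seq() a from/to of length 1
j <- 2
one_draw <- sample(seq(start[j], end[j]), 1)
print(one_draw)
```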

The outer looping structure is pretty complicated so I haven't included it here, but the df[j,4] part of the code is necessary because it is part of a larger loop. There are situations where rows have to be resampled based on additional dependencies in the actual dataset. For example, the sampled value of a might need to be larger than b. The rest of the code updates the sampled column, checks for dependencies, and will rerun the sample if the dependencies aren't met. If I can get this sampling section to work, I should be able to plug it in without too much trouble (I hope).
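The resample-until-valid idea described above might look roughly like the sketch below. Everything here is an assumption made for illustration: the `draw_row` helper, the two-row data frame, the `max_tries` cap, and the single rule that a must end up larger than b. The real loop and its dependencies are not shown in the question.

```r
df <- data.frame(
  ID      = c("a", "b"),
  start   = c(25, 36),
  end     = c(67, 97),
  sampled = NA_real_
)

# Hypothetical helper: one draw between row j's start and end
draw_row <- function(df, j) {
  sample(seq(df$start[j], df$end[j]), 1)
}

set.seed(42)      # only so the run is repeatable
max_tries <- 1000 # guard against looping forever on an unlucky run

for (attempt in seq_len(max_tries)) {
  # Fill the sampled column one row at a time, as df[j, 4] would
  for (j in seq_len(nrow(df))) {
    df[j, "sampled"] <- draw_row(df, j)
  }
  # Hypothetical dependency: a (row 1) must be larger than b (row 2)
  if (df$sampled[1] > df$sampled[2]) break
}

print(df)
```

Capping the number of attempts is a design choice for the sketch: some combinations of ranges and rules may rarely or never be satisfied, and a cap turns that into a visible failure instead of a hang.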

Here's a sample data set.

structure(list(ID = c("a", "b", "c", "d", "e", "f", "g", "h", 
"i", "j", "k", "l", "m", "n"), start = c(25, 36, 23, 15, 21, 
43, 39, 27, 11, 21, 28, 44, 16, 25), end = c(67, 97, 85, 67, 
52, 72, 55, 62, 99, 89, 65, 58, 77, 88), sampled = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L), spec = structure(list(
    cols = list(ID = structure(list(), class = c("collector_character", 
    "collector")), start = structure(list(), class = c("collector_double", 
    "collector")), end = structure(list(), class = c("collector_double", 
    "collector")), sampled = structure(list(), class = c("collector_logical", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

Solution

  • Figured it out:

    df[j,4] <- mapply(function(x, y) sample(seq(x, y), 1), df[j,"start"], df[j,"end"])

    I just needed to be specific about which row supplies the values going into df[j,4]. Passing only row j of the start and end columns makes mapply return a single value instead of one per row, which did the trick.
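For a single row, the same draw can also be written without `mapply`. The sketch below is an alternative under stated assumptions, not the method from the answer: it uses a three-row plain data.frame, and the `for` loop merely stands in for the outer loop that is not shown. It also sidesteps a documented quirk of `sample`: when its first argument is a single number of at least 1, it samples from `1:x`, so a row where start equals end could return a value outside the range.

```r
df <- data.frame(
  ID      = c("a", "b", "c"),
  start   = c(25, 36, 23),
  end     = c(67, 97, 85),
  sampled = NA_real_
)

set.seed(7)  # only so the run is repeatable

for (j in seq_len(nrow(df))) {
  # Scalars for row j, so seq() gets a from/to of length 1
  candidates <- seq(df$start[j], df$end[j])
  # Drawing an index with sample.int() stays correct even when
  # start == end, where sample(candidates, 1) would draw from 1:candidates
  df[j, "sampled"] <- candidates[sample.int(length(candidates), 1)]
}

print(df)
```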