Given an array, how can I extract n
non overlapping random samples of size m
from it?
For example, given the array:
const arr = [1, 2, 3, 4, 5, 6, 7, 8];
calling sample(arr, 3, 2)
would for example return [[7, 8], [4, 5], [2, 3]]
, calling sample(arr, 2, 4)
would necessarily return [[1, 2, 3, 4], [5, 6, 7, 8]
, and calling sample(arr, 5, 2)
would throw an error.
EDIT - Maybe this wasn't clear in the initial question: samples should be lists of contiguous elements. That is why sample(arr, 2, 4)
can only return [[1, 2, 3, 4], [5, 6, 7, 8]
and not [[2, 3, 1, 6], [5, 4, 7, 8]
, for example.
You could start off by first creating a list with the format of the return value:
[ 1, 2, 3, 4, 5, 6, 7, 8]
[<---->, <---->, <---->, <>, <>] // sample(array, 3, 2)
[<------------>, <------------>] // sample(array, 2, 4)
These format arrays could be written out using the lengths:
[1, 2, 3, 4, 5, 6, 7, 8]
[ 2, 2, 2, 1, 1] // sample(array, 3, 2)
[ 4, 4] // sample(array, 2, 4)
Then shuffle the format arrays to gain a random sample selection:
[1, 2, 3, 4, 5, 6, 7, 8]
[ 2, 1, 2, 2, 1] // sample(array, 3, 2)
[ 4, 4] // sample(array, 2, 4)
Then for each element of the format array, remove the the first n
elements from the input array. Then store them unless it was a filler (one size chunks that are put in to reach the array length).
[1, 2, 3, 4, 5, 6, 7, 8]
[[1,2], [4,5], [6,7]] // sample(array, 3, 2)
[[1,2,3,4], [5,6,7,8]] // sample(array, 2, 4)
Lastly shuffle the resulting samples.
[1, 2, 3, 4, 5, 6, 7, 8]
[[4,5], [1,2], [6,7]] // sample(array, 3, 2)
[[5,6,7,8], [1,2,3,4]] // sample(array, 2, 4)
const arr = [1, 2, 3, 4, 5, 6, 7, 8];
console.log(sample(arr, 3, 2));
console.log(sample(arr, 2, 4));
console.log(sample(arr, 5, 2));
function randomInt(limit) {
return Math.floor(Math.random() * limit);
}
function shuffle(array) {
for (let limit = array.length; limit > 0; --limit)
array.push(...array.splice(randomInt(limit), 1));
}
function sample(array, sampleCount, sampleLength) {
let elementCount = sampleCount * sampleLength;
if (elementCount > array.length)
throw "invalid sampleCount/sampleLength arguments";
const filler = {valueOf: () => 1};
const fillerCount = array.length - elementCount;
const lengths = Array.from(
{length: sampleCount + fillerCount},
(_, i) => i < sampleCount ? sampleLength : filler
);
shuffle(lengths);
const samples = Array.from(array);
for (const length of lengths) {
const sample = samples.splice(0, length);
if (length === filler) continue;
samples.push(sample);
}
shuffle(samples);
return samples;
}
Note that ===
is important in length === filler
. If you use ==
, filler
would also equal 1
. This would then conflict with a call like sample(array, 5, 1)
where each sample length is 1
.
const filler = {valueOf: () => 1};
console.log("1 == filler //=>", 1 == filler);
console.log("2 == filler //=>", 2 == filler);
console.log("filler == filler //=>", filler == filler);
console.log("1 === filler //=>", 1 === filler);
console.log("2 === filler //=>", 2 === filler);
console.log("filler === filler //=>", filler == filler);