Search code examples
javascriptarraysrandomsample

Get n non overlapping m-sized samples from an array


Given an array, how can I extract n non overlapping random samples of size m from it?

For example, given the array:

const arr = [1, 2, 3, 4, 5, 6, 7, 8];

calling sample(arr, 3, 2) would for example return [[7, 8], [4, 5], [2, 3]], calling sample(arr, 2, 4) would necessarily return [[1, 2, 3, 4], [5, 6, 7, 8], and calling sample(arr, 5, 2) would throw an error.

EDIT - Maybe this wasn't clear in the initial question: samples should be lists of contiguous elements. That is why sample(arr, 2, 4) can only return [[1, 2, 3, 4], [5, 6, 7, 8] and not [[2, 3, 1, 6], [5, 4, 7, 8], for example.


Solution

  • You could start off by first creating a list with the format of the return value:

    [ 1,  2,  3,  4,  5,  6,  7,  8]
    [<---->, <---->, <---->, <>, <>] // sample(array, 3, 2)
    [<------------>, <------------>] // sample(array, 2, 4)
    

    These format arrays could be written out using the lengths:

    [1, 2, 3, 4, 5, 6, 7, 8]
    [   2,    2,    2, 1, 1] // sample(array, 3, 2)
    [         4,          4] // sample(array, 2, 4)
    

    Then shuffle the format arrays to gain a random sample selection:

    [1, 2, 3, 4, 5, 6, 7, 8]
    [   2, 1,    2,    2, 1] // sample(array, 3, 2)
    [         4,          4] // sample(array, 2, 4)
    

    Then for each element of the format array, remove the the first n elements from the input array. Then store them unless it was a filler (one size chunks that are put in to reach the array length).

    [1, 2, 3, 4, 5, 6, 7, 8]
    [[1,2], [4,5], [6,7]]  // sample(array, 3, 2)
    [[1,2,3,4], [5,6,7,8]] // sample(array, 2, 4)
    

    Lastly shuffle the resulting samples.

    [1, 2, 3, 4, 5, 6, 7, 8]
    [[4,5], [1,2], [6,7]]  // sample(array, 3, 2)
    [[5,6,7,8], [1,2,3,4]] // sample(array, 2, 4)
    

    const arr = [1, 2, 3, 4, 5, 6, 7, 8];
    console.log(sample(arr, 3, 2));
    console.log(sample(arr, 2, 4));
    console.log(sample(arr, 5, 2));
    
    function randomInt(limit) {
      return Math.floor(Math.random() * limit);
    }
    
    function shuffle(array) {
      for (let limit = array.length; limit > 0; --limit)
        array.push(...array.splice(randomInt(limit), 1));
    }
    
    function sample(array, sampleCount, sampleLength) {
      let elementCount = sampleCount * sampleLength;
      if (elementCount > array.length)
        throw "invalid sampleCount/sampleLength arguments";
        
      const filler = {valueOf: () => 1};
      const fillerCount = array.length - elementCount;
      const lengths = Array.from(
        {length: sampleCount + fillerCount},
        (_, i) => i < sampleCount ? sampleLength : filler
      );
    
      shuffle(lengths);
      const samples = Array.from(array);
      for (const length of lengths) {
        const sample = samples.splice(0, length);
        if (length === filler) continue;
        samples.push(sample);
      }
      shuffle(samples);
      
      return samples;
    }

    Note that === is important in length === filler. If you use ==, filler would also equal 1. This would then conflict with a call like sample(array, 5, 1) where each sample length is 1.

    const filler = {valueOf: () => 1};
    
    console.log("1 == filler       //=>", 1 == filler);
    console.log("2 == filler       //=>", 2 == filler);
    console.log("filler == filler  //=>", filler == filler);
    console.log("1 === filler      //=>", 1 === filler);
    console.log("2 === filler      //=>", 2 === filler);
    console.log("filler === filler //=>", filler == filler);