Search code examples
c#pseudocode

Optimizing array that has many elements and different standards


I have a function that takes in X as an argument and randomly picks an element from a 2D array. The 2D array has thousands of elements, each of them has a different requirement on X, stored in arr[Y][1].

For example,

  • arr[0] should only be chosen when X is larger than 4. (arr[0][1] = 4+)

  • Then arr[33] should only be chosen when X is between 37 and 59. (arr[33][1] = 37!59)

  • And arr[490] should only be chosen when X is less than 79. (arr[490][1] = 79-)

    And there are many more, most with a different X requirement.

What is the best way to tackle this problem that takes the least space, and least repetition of elements?

The worst way would be storing possible choices for each X in a 2D array. But that would cause a lot of repetition, costing too much memory.

Then, I have thought about using three arrays, separating X+ requirements, X- and X range. But it still sounds too basic to me, is there a better way?


Solution

  • One option here would be what's called "accept/reject sampling": you pick a random index i and check if the condition on X is satisfied for that index. If so, you return arr[i]. If not, you pick another index at random and repeat until you find something.

    Performance will be good so long as most conditions are satisfied for most values of i. If this isn't the case -- if there are a lot of values of X for which only a tiny number of conditions are satisfied -- then it might make sense to try and precompute something that lets you find (or narrow down) the indices that are allowable for a given X.

    How to do this depends on what you allow as a condition on each index. For instance, if every condition is given by an interval like in the examples you give, you could sort the list twice, first by left endpoints and then by right endpoints. Then determining the valid indices for a particular value of X comes down to intersecting the intervals whose left endpoint is less than or equal to X with those whose right endpoint is greater than or equal to X.

    Of course if you allow conditions other than "X is in this interval" then you'd need a different algorithm.