
Discrete Probability Distribution in Java


I have a set of integers each of which has a probability assigned, derived from earlier experiments, e.g.:

0 = 0.5
1 = 0.2
2 = 0.3

As required of a probability distribution, these weights sum to 1.0. I am now looking for an efficient way to sample one of the values while taking the given probabilities into account, e.g. (pseudo-code):

Distribution distribution = new DiscreteDistribution(new double[]{0.5, 0.2, 0.3});
distribution.sample();

This should result in 0 half of the time, according to the given numbers. However, the probabilities may be arbitrary; do not assume any pattern or regularity among them.

I've been using Apache Commons Math for my previous experiments, but it does not seem to provide a solution for this scenario, and neither does Colt.

I wonder whether this is because I've missed an easy solution. A naive implementation seems more or less straightforward, but doing this efficiently is rather involved. That is why I am looking for an established implementation.


Solution

  • A very simple generic solution would be:

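    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    class Distribution<T> {
        private final List<Double> probs = new ArrayList<>();
        private final List<T> events = new ArrayList<>();
        private double sumProb;
        private final Random rand = new Random();

        Distribution(Map<T, Double> probs) {
            for (Map.Entry<T, Double> entry : probs.entrySet()) {
                sumProb += entry.getValue();
                events.add(entry.getKey());
                this.probs.add(entry.getValue());
            }
        }

        public T sample() {
            // Scale by sumProb so the weights need not sum to exactly 1.0.
            double prob = rand.nextDouble() * sumProb;
            // Subtract each weight in turn until the draw is used up.
            for (int i = 0; i < events.size() - 1; i++) {
                prob -= probs.get(i);
                if (prob < 0) {
                    return events.get(i);
                }
            }
            // Fall back to the last event; this also absorbs rounding error.
            return events.get(events.size() - 1);
        }
    }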
    

    Feel free to adapt it as needed, e.g. by adding other constructors. There is of course a lot to improve, starting with efficiency (sample() scans the events linearly, so each draw takes O(n) time), but it is something you can reuse often.
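
  • One possible way to address the efficiency point is to precompute cumulative sums once and binary-search them on every draw, which brings sampling down to O(log n). The following is only a sketch under a few assumptions: the class name CumulativeDistribution is made up for illustration, the Random instance is passed in so that results can be reproduced, and negative weights are rejected. It is not taken from Apache Commons Math or Colt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical class name; an illustrative sketch, not a library API.
class CumulativeDistribution<T> {
    private final List<T> events = new ArrayList<>();
    private final double[] cumulative; // cumulative[i] = sum of weights 0..i
    private final Random rand;

    CumulativeDistribution(Map<T, Double> probs, Random rand) {
        if (probs.isEmpty()) {
            throw new IllegalArgumentException("at least one event is required");
        }
        this.cumulative = new double[probs.size()];
        this.rand = rand;
        double sum = 0.0;
        int i = 0;
        for (Map.Entry<T, Double> entry : probs.entrySet()) {
            if (entry.getValue() < 0) {
                throw new IllegalArgumentException("negative weight");
            }
            sum += entry.getValue();
            events.add(entry.getKey());
            cumulative[i++] = sum;
        }
        if (sum <= 0) {
            throw new IllegalArgumentException("weights must not all be zero");
        }
    }

    public T sample() {
        double total = cumulative[cumulative.length - 1];
        double draw = rand.nextDouble() * total; // uniform in [0, total)
        int pos = Arrays.binarySearch(cumulative, draw);
        // Not found: binarySearch returns -(insertionPoint) - 1, and the
        // insertion point is the first index whose cumulative sum exceeds draw.
        // Found at k: draw equals cumulative[k], so it belongs to a later event.
        int index = pos >= 0 ? pos + 1 : -pos - 1;
        // Skip zero-weight events that share the same cumulative sum.
        while (index < cumulative.length - 1 && cumulative[index] <= draw) {
            index++;
        }
        // Clamp to guard against floating-point rounding at the upper end.
        return events.get(Math.min(index, events.size() - 1));
    }
}
```

    Setup costs O(n) and each call to sample() costs O(log n). If even that is too slow, the alias method (Walker/Vose) gives O(1) draws after O(n) setup, at the price of a more involved construction.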