javascript algorithm statistics probability statistics-bootstrap

How do I calculate the probability of a presidential candidate winning?


I have an object like

const data = {
    'Washington' : { ElectoralVotes : 12, RChance: 0 },
    'Oregon': { ElectoralVotes: 7, RChance: 15 }, 
    .
    .
    . 
    'Hawaii' : { ElectoralVotes: 4, RChance : 35 }
}

where a key-value pair like

'Washington' : { ElectoralVotes : 12, RChance: 0 }

means "Washington state has 12 electoral votes and the Republican candidate has a 0% chance of winning the state." From this I'm trying to approximate the chance of the Republican winning.

I realize that there are 2^51 subcollections of states and so the correct method, which involves too much computation for ordinary computers, would be

total = 0;
For each array A in [ [], ['Washington'], ['Oregon'], ... , ['Washington', 'Oregon', ..., 'Hawaii'] ]
    If (sum of electoral votes of states in A) >= 270
         p = multiply together chances of winning states in A
         total += p;

and then total is the chance that the Republican wins. But since I can't do that, let's say I instead run the procedure over a random collection of 2^10 collections of states. Would I then multiply total by 2^41 to get an approximation of the true value?


Solution

  • The problem with the approach you describe in your question is that the number of subsets of states to consider grows exponentially. That makes it infeasible even when the set of states is relatively small (e.g. 50). However, you can solve this using dynamic programming in time O(NS), where N is the total number of electoral votes and S is the number of states.

    Start with an array P of size N+1. Entry i in the array will represent the probability that the Republican will get i electoral votes. It's of size N+1 because the number of votes they can get is 0 to N inclusive.

    Initialize the array to 0, except for the first entry, which is 1. That describes the probabilities before any state has been included in the calculation: the candidate is certain to have 0 electoral votes if no states are included yet.

    Now, for a new state (Washington, say), we can update the array to include that state too. Let's say there's k electoral votes, and the probability that our candidate wins there is p.

    Let P2 be the new array of probabilities. If i < k, then:

    P2[i] = P[i] * (1 - p)
    

    And if i >= k, then:

    P2[i] = P[i] * (1 - p) + P[i-k] * p
    

    That is, the probability that the candidate now has i votes is the probability that they already had i votes and they lost Washington, plus the probability that they previously had i-k votes (if that's possible) and they won Washington.
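
    As a minimal sketch of that single update step, the helper below folds one state into the array. The name includeState and the two example states are made up for illustration; they are not part of the question's data.

    ```javascript
    // Hypothetical helper: fold one state into the distribution P.
    // P[i] is the probability of holding exactly i electoral votes so far,
    // k is the state's electoral votes, p is the win probability (0 to 1).
    function includeState(P, k, p) {
        const P2 = new Array(P.length).fill(0);
        for (let i = 0; i < P.length; i++) {
            // Candidate loses the state: the vote count stays at i.
            P2[i] = P[i] * (1 - p);
            // Candidate wins the state: they must have had i - k votes before.
            if (i >= k) {
                P2[i] += P[i - k] * p;
            }
        }
        return P2;
    }

    // Two made-up states: 3 votes at 50%, then 2 votes at 25%.
    let P = [1, 0, 0, 0, 0, 0];      // N = 5, so N + 1 entries
    P = includeState(P, 3, 0.5);     // [0.5, 0, 0, 0.5, 0, 0]
    P = includeState(P, 2, 0.25);    // [0.375, 0, 0.125, 0.375, 0, 0.125]
    console.log(P);
    ```

    The entries still sum to 1 after each step, which is a handy sanity check while debugging.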

    Once we've included all the states like this, the probability they win the election is the sum of the probabilities of them having i votes where i > N/2.

    In pseudo-code:

    P[] = {1, 0, 0, ..., 0} // size N+1
    for state of all_states {
        let p = state.RChance / 100.0
        let k = state.ElectoralVotes
        P2 = new array of size N+1.
        for i = 0, 1 ... N {
            P2[i] = P[i] * (1 - p)
            if i >= k {
                P2[i] += P[i - k] * p
            }
        }
        P = P2
    }
    win_probability = sum(P[i] for i = floor(N/2)+1 ... N)
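
    A runnable JavaScript sketch of the pseudo-code above might look like this. It assumes the data object has the shape shown in the question (ElectoralVotes as an integer, RChance as a percentage from 0 to 100) and that state outcomes are independent. The function name winProbability and the three-state sample are invented for illustration.

    ```javascript
    // Sketch: probability that the candidate gets more than half of all electoral votes.
    function winProbability(data) {
        const states = Object.values(data);
        const N = states.reduce((sum, s) => sum + s.ElectoralVotes, 0);

        // P[i] = probability of exactly i votes; before any state, 0 votes is certain.
        let P = new Array(N + 1).fill(0);
        P[0] = 1;

        for (const state of states) {
            const p = state.RChance / 100;
            const k = state.ElectoralVotes;
            const P2 = new Array(N + 1).fill(0);
            for (let i = 0; i <= N; i++) {
                P2[i] = P[i] * (1 - p);            // lost this state
                if (i >= k) P2[i] += P[i - k] * p; // won this state
            }
            P = P2;
        }

        // Sum the probabilities of every outcome with i > N/2.
        let win = 0;
        for (let i = Math.floor(N / 2) + 1; i <= N; i++) win += P[i];
        return win;
    }

    // Made-up three-state example (not real polling data): N = 9, so 5 votes win.
    const sample = {
        A: { ElectoralVotes: 3, RChance: 50 },
        B: { ElectoralVotes: 2, RChance: 25 },
        C: { ElectoralVotes: 4, RChance: 100 },
    };
    console.log(winProbability(sample)); // 0.625
    ```

    In the sample, state C is certain, so the candidate wins unless they lose both A and B: 1 - 0.5 * 0.75 = 0.625.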
    

    In principle one can avoid the P2 array by updating P in place, but it's a little trickier to code (because you have to iterate backwards to avoid changing entries you later need to read). Also in principle the array P can be of size floor(N/2) + 2 with the last element representing the win probability directly. But again, this makes it even fiddlier to code.
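
    For the in-place variant, a sketch under the same assumptions about the data shape could look like the following. The name winProbabilityInPlace is hypothetical. The inner loop runs from N down to 0 so that P[i - k] is still the value from before the current state was included when it is read.

    ```javascript
    // Sketch of the in-place update: no second array, iterate i downward.
    function winProbabilityInPlace(data) {
        const states = Object.values(data);
        const N = states.reduce((sum, s) => sum + s.ElectoralVotes, 0);

        const P = new Array(N + 1).fill(0);
        P[0] = 1;

        for (const { ElectoralVotes: k, RChance } of states) {
            const p = RChance / 100;
            // Going backwards means entries below i have not been overwritten yet.
            for (let i = N; i >= 0; i--) {
                const won = i >= k ? P[i - k] * p : 0;
                P[i] = P[i] * (1 - p) + won;
            }
        }

        let win = 0;
        for (let i = Math.floor(N / 2) + 1; i <= N; i++) win += P[i];
        return win;
    }

    // Made-up example: two states, 3 votes at 50% and 2 votes at 25% (N = 5).
    const twoStates = {
        A: { ElectoralVotes: 3, RChance: 50 },
        B: { ElectoralVotes: 2, RChance: 25 },
    };
    console.log(winProbabilityInPlace(twoStates)); // 0.5
    ```

    With 5 votes in total the candidate needs at least 3, which happens exactly when they win state A, hence 0.5.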