Search code examples
pythonnumpyprobabilitydistribution

Generate a matrix of transition probabilities for bit strings of given size following some probability distribution


I want to create an 8x8 matrix which provides the error probabilities in bit communication. The matrix looks as follows:enter image description here

The columns amount to the observed quantities and the rows to the measured quantities. An element p[i,j] amounts to the conditional probability p(j|i). For example, the element p[0,1] gives the probability to observe the string 001 when the actual value is 000, i.e., it measures p(001|000).

Question: How can I create such a matrix in Python such that

  1. The more bit flips there are the smaller is the equivalent conditional probability (for example p(100|000)<p(110|000)?
  2. How to enable an "asymmetry". I.e., the probability of p(001|000)< p(000|001). That is, having bias that favors with higher probabilities transitions 1 to 0 than transitions 0 to 1.

Of course, the sum of probabilities in each row must equal to 1.

All in all, I want to create function in Python that takes as input an integer n (the size of the matrix, or equivalently where 2^n is the length of the bit string) and outputs a probability transition matrix with the above specified rules.

The difficulty is how to implement a probability distribution to fill the cells.

It is trivial to create an 8x8 array and fill diagonals:

P = np.zeros((8,8))
for i in range(8):
    for j in range(8):
        if i==j:
            P[i,j]=1

Similarly, its trivial to fill a given row or a given column by a fixed number. However, I cannot figure out (how to even begin) to fill such a matrix following the rules above, or even how exactly to define the distribution the elements must follow.


Solution

  • It turns out you can do this simply without numpy or scipy. I use pandas for nice printing.

    The logic is that for each bit, you have a probability of flipping (p01 or p10) or remaining the same (p00 or p11). Transforming one bit string to another requires multiplying the appropriate probability for each of n bits.

    For example: P(010|001) = P(0->0) * P(1->0) * P(0->1) = p00 * p10 * p01

    This process is repeated for every sent and observed combination.

    You can further reduce the two level if statement below to one line using nested ternary assignment, but I think this is a nice balance of being concise and readable:

    import pandas as pd
    
    
    def p(sent, observed, p01, p10):
        """Return the probability of 'sent' being received as 'observed'
        given p01 (the probability a bit flips from a 0->1) and p10 (the
        probability a bit flips from 1->0).
        """
        p00 = 1 - p01
        p11 = 1 - p10
        r = 1
        for i, _ in enumerate(sent):
            if sent[i] == "0":
                r *= p00 if observed[i] == "0" else p01
            else:
                r *= p10 if observed[i] == "0" else p11
        return r
    
    def generate_error_matrix(n, p01, p10):
        """Print a matrix of the transitions of all permutations of bit
        errors for a given bit length.
    
        Parameters:
            n - the number of bits
            p01 - probability of a bit flipping from 0 to 1
            p10 - probability of a bit flipping from 1 to 0
        """
        labels = [f"{i:0{n}b}" for i in range(0, 2**n)]
        result = pd.DataFrame(index=labels, columns=labels)
        for rowIndex, row in result.iterrows():
            for columnIndex, _ in row.items():
                result.at[rowIndex, columnIndex] = p(rowIndex, columnIndex, p01, p10)
        return result
    

    Here's an example:

    print(generate_error_matrix(n=3, p01=0.2, p10=0.1))
    
           000    001    010    011    100    101    110    111
    000  0.512  0.128  0.128  0.032  0.128  0.032  0.032  0.008
    001  0.064  0.576  0.016  0.144  0.016  0.144  0.004  0.036
    010  0.064  0.016  0.576  0.144  0.016  0.004  0.144  0.036
    011  0.008  0.072  0.072  0.648  0.002  0.018  0.018  0.162
    100  0.064  0.016  0.016  0.004  0.576  0.144  0.144  0.036
    101  0.008  0.072  0.002  0.018  0.072  0.648  0.018  0.162
    110  0.008  0.002  0.072  0.018  0.072  0.018  0.648  0.162
    111  0.001  0.009  0.009  0.081  0.009  0.081  0.081  0.729
    

    And some edge cases:

    Zeroes always flip to ones, ones never flip to zeroes:

    print(generate_error_matrix(n=3, p01=1, p10=0))
    
        000 001 010 011 100 101 110 111
    000   0   0   0   0   0   0   0   1
    001   0   0   0   0   0   0   0   1
    010   0   0   0   0   0   0   0   1
    011   0   0   0   0   0   0   0   1
    100   0   0   0   0   0   0   0   1
    101   0   0   0   0   0   0   0   1
    110   0   0   0   0   0   0   0   1
    111   0   0   0   0   0   0   0   1
    

    Ones always flip to zeroes, zeroes never flip to ones:

    print(generate_error_matrix(n=3, p01=0, p10=1))
    
        000 001 010 011 100 101 110 111
    000   1   0   0   0   0   0   0   0
    001   1   0   0   0   0   0   0   0
    010   1   0   0   0   0   0   0   0
    011   1   0   0   0   0   0   0   0
    100   1   0   0   0   0   0   0   0
    101   1   0   0   0   0   0   0   0
    110   1   0   0   0   0   0   0   0
    111   1   0   0   0   0   0   0   0
    

    Bits always flip:

    print(generate_error_matrix(n=3, p01=1, p10=1))
    
        000 001 010 011 100 101 110 111
    000   0   0   0   0   0   0   0   1
    001   0   0   0   0   0   0   1   0
    010   0   0   0   0   0   1   0   0
    011   0   0   0   0   1   0   0   0
    100   0   0   0   1   0   0   0   0
    101   0   0   1   0   0   0   0   0
    110   0   1   0   0   0   0   0   0
    111   1   0   0   0   0   0   0   0
    

    Every bit has a 50% chance of flipping, regardless of direction:

    print(generate_error_matrix(n=3, p01=0.5, p10=0.5))
    
           000    001    010    011    100    101    110    111
    000  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    001  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    010  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    011  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    100  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    101  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    110  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
    111  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125