
Generate values from a frequency distribution


I'm currently analyzing 16-bit binary strings, something like 0010001010110100. I have approximately 30 of these strings. I have written a simple program in Matlab that counts the number of 1s at each bit position across all 30 strings.

So, for example (bit position, followed by the number of strings with a 1 at that position):

1 30
2 15
3 1
4 10
etc.

I want to generate more strings (hundreds of them) that roughly follow the frequency distribution above. Is there a Matlab (or Python or R) command that does that?

What I'm looking for is something like this: http://www.prenhall.com/weiss_dswin/html/simulate.htm


Solution

  • In MATLAB: draw uniform random numbers with rand and compare them against the per-bit probabilities using < (or lt, less than):

    len = 16; % string length
    nObserved = 30; % number of observed strings
    % counts of 1s for each bit (just random integers here; use your real counts)
    counts = randi([0 nObserved],[1 len]);
    % probability of a 1 in each bit
    prob = counts./nObserved;
    % generate n random strings
    n = 100;
    moreStrings = rand(n,len);
    % a bit is 1 when its uniform random number is less than that bit's probability
    moreStrings = bsxfun(@lt, moreStrings, prob); % lt(x,y) := x < y
    

    In Python:

    import numpy as np
    
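    n_bits = 16 # string length (avoid the name len, which shadows the built-in)
    n_observed = 30 # number of observed strings
    # counts of 1's for each bit (just random integers here; use your real counts)
    # randint excludes the upper bound, so n_observed + 1 allows counts of 0..30
    counts = np.random.randint(0, n_observed + 1, (1, n_bits)).astype(float)
    # probability of a 1 in each bit
    prob = counts / n_observed
    # generate n random strings
    n = 100
    moreStrings = np.random.rand(n, n_bits)
    # a bit is 1 when its uniform random number is less than that bit's probability
    # (prob has shape (1, n_bits) and broadcasts across the n rows)
    moreStrings = moreStrings < prob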
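
To go from real data to actual bit strings rather than a boolean matrix, the same idea can be wrapped up roughly as follows. This is only a sketch: the function names `bit_counts` and `sample_bit_strings` and the three `observed` strings are made up for illustration, and it assumes all strings have the same length. Note that each bit is sampled independently, so only the per-bit frequencies are reproduced, not any correlation between bits.

```python
import numpy as np

def bit_counts(strings):
    """Count the 1s at each bit position across equal-length bit strings."""
    bits = np.array([[int(ch) for ch in s] for s in strings])
    return bits.sum(axis=0)

def sample_bit_strings(counts, n_observed, n_new, seed=None):
    """Draw n_new bit strings; bit i is 1 with probability counts[i] / n_observed."""
    rng = np.random.default_rng(seed)
    prob = np.asarray(counts, dtype=float) / n_observed
    # rng.random returns values in [0, 1), so prob 0 never fires and prob 1 always does
    bits = rng.random((n_new, prob.size)) < prob
    return ["".join("1" if b else "0" for b in row) for row in bits]

# Hypothetical observed data (stand-in for the ~30 real strings)
observed = ["0010001010110100", "1010001010010101", "0010101000110100"]
counts = bit_counts(observed)
new_strings = sample_bit_strings(counts, len(observed), 100, seed=0)
```

With enough generated strings, `bit_counts(new_strings) / len(new_strings)` should come out close to `counts / len(observed)`; passing a `seed` just makes the draw repeatable.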