Search code examples
pythonrandomlcgrandom-testing

Dieharder random test suite - suspiciously good results


I generated a txt file based on the following generator (2500000 numbers)

import numpy as np

class LCG(object):

    UZERO: np.uint32 = np.uint32(0)
    UONE : np.uint32 = np.uint32(1)

    def __init__(self, seed: np.uint32, a: np.uint32, c: np.uint32) -> None:
        self._seed: np.uint32 = np.uint32(seed)
        self._a   : np.uint32 = np.uint32(a)
        self._c   : np.uint32 = np.uint32(c)

    def next(self) -> np.uint32:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint32:
        return self._seed

    def set_seed(self, seed: np.uint32) -> np.uint32:
        self._seed = seed

    def skip(self, ns: np.int32) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

        nskip: np.uint32 = np.uint32(ns)

        a: np.uint32 = self._a
        c: np.uint32 = self._c

        a_next: np.uint32 = LCG.UONE
        c_next: np.uint32 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next


#%%
np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)
q = [rng.next() for _ in range(0, 2500000)]

I saved the file using this code:

First cell

%%capture cap --no-stderr
print(q)

Second cell

with open('output5.txt', 'w') as f:
    f.write(cap.stdout)

Then I used the Diehard suite to carry out the tests in the following way:

dieharder -f output5.txt -a

I'm not sure if the tests are actually running on my txt file and whether my txt file is right. The sample of 2.5 million numbers is about 30mb.

I'm surprised that all tests are going well.

Below is the result in the terminal.

I'm confused because the name is MT19937 - this is not my name and the file is "output5.txt" is my file. I do not know if the tests are performed on my file

enter image description here


Solution

  • I generate 2.5 million lines with the generator save them to file testrands.txt with this header:

    #==================================================================
    # generator lcg  seed = 1
    #==================================================================
    type: d
    count: 100000
    numbit: 32
    1015568748
    1586005467
    2165703038
    3027450565
    217083232
    1587069247
    ......
    

    I called with:

    dieharder -g 202 -f testrands.txt -a
    

    now the results are surprisingly weak (maybe I have generated too few numbers?)

    I'm also not sure if all of these tests are suitable for LCG testing but the result is surprisingly weak

    1

    2

    I do as it is in the guide but still seems to be something not as it should - LCG passes a birthday-spacing (I think it should not) and the remaining results are surprisingly weak