Search code examples
pythonalgorithmrandom

Generating unique random numbers from non-unique random numbers


The following function converts a list of non-unique random numbers into a list of unique random numbers using a helper list of sequential integers:

def get_d(rlist):
    # t[] is a list of ints 0..n-1 where n is big
    # enough not to cause issues when deleting items
    t  = list(range( len(rlist) + max(rlist) ))
    d  = []            # the output list we will populate
    for r in rlist:
        d.append(t[r]) # add r'th remaining item in t[] to d[]
        t.pop(r)       # delete that item from t[]
    return d

e.g.
get_d([3, 2, 3, 1, 5]) = [3, 2, 5, 1, 9]

The reason for favouring this way of doing things is personal preference. It strikes me as a very natural ordering and it seems like one of the most obvious ways of converting a list of integers into another while guaranteeing uniqueness.

I want to remove the dependance on the temporary list so that very large items can be accomodated without resorting to a very large list. I'm open to other ways of ordereding the output but I'm most interested in an answer to getting the same output using a counting or other method.

This is my attempt, which has some edge case or perhaps more fundamental issues:

def get_e(r):
    e = []
    for i,t in enumerate(r):
        c = 0
        # track numbers <= the current one
        for j in range(i):
            if r[j] <= t:
                c+=1
        # track numbers <= the current one when virtual deletions are accounted for
        for j in range(i):
            if r[j] > t and r[j]-c <= t:
                c+=1
        e.append(t+c)
    return e

Here is a test:

for r in [ [3,2,3,1,5], [0,0,1,0,0]]:
    d = get_d(r)
    e = get_e(r)
    print('match:   ' if d==e else 'mismatch:', r, '  : ',  d, '  ', e)


Output: 
match:    [3, 2, 3, 1, 5]   :  [3, 2, 5, 1, 9]    [3, 2, 5, 1, 9]
mismatch: [0, 0, 1, 0, 0]   :  [0, 1, 3, 2, 4]    [0, 1, 3, 3, 4]

Solution

  • I came up with the following, which seems to do the trick, passing the test I showed above. It's much more involved than I expected and there's certainly room for optimisation.

    def splitlist(t,i):
        c = t[0]+i
        a = [t[0],c-1]
        b = [c+1, t[1]]
        if c==0:
            return [[b],c]
        if c==t[1]:
            return [[a],c]
        return [[a,b],c]
    
    def get_f(x):
        u = [[ 0, max(x) + len(x) ]]
        print(f'u: {u}')
        v = []
        for k in x:
            i = k
            for j,t in enumerate(u):
                L =  t[1]-t[0]+1 # "len(t)"
                if i < L:
                    x,y = splitlist(t,i)
                    break
                else:
                    i-=L
            v.append(y)
            del u[j]
            if len(x) > 1:
                u.insert(j, x[1])
            u.insert(j, x[0])
            print(f'{k}->{y} : {u}')
        return v
    

    The idea is to start off with something like this as input:

    x = [ 5, 8, 0, 20, 15 ]
    

    and make use of a helper list of lists, initialised as:

    u = [ [0,25] ]  
    

    then keep finding and splitting the appropriate list within u as follows:

    >>> get_f([5, 8, 0, 20, 15 ])
    u: [[0, 25]]
    5->5 : [[0, 4], [6, 25]]
    8->9 : [[0, 4], [6, 8], [10, 25]]
    0->0 : [[1, 4], [6, 8], [10, 25]]
    20->23 : [[1, 4], [6, 8], [10, 22], [24, 25]]
    15->18 : [[1, 4], [6, 8], [10, 17], [19, 22], [24, 25]]
    [5, 9, 0, 23, 18]
    

    which matches the original:

    >>> get_d([5, 8, 0, 20, 15 ])
        [5, 9, 0, 23, 18]