By "similar vector" I mean a vector that differs from the given one at exactly one position, by either -1 or +1. However, if the element at that position is zero, only a difference of +1 is possible. Examples:
similar_vectors(np.array([0,0,0]))
array([[ 1., 0., 0.],
       [ 0., 1., 0.],
       [ 0., 0., 1.]])
similar_vectors(np.array([1,0,2,3,0,0,1]))
array([[ 0., 0., 2., 3., 0., 0., 1.],
       [ 2., 0., 2., 3., 0., 0., 1.],
       [ 1., 1., 2., 3., 0., 0., 1.],
       [ 1., 0., 1., 3., 0., 0., 1.],
       [ 1., 0., 3., 3., 0., 0., 1.],
       [ 1., 0., 2., 2., 0., 0., 1.],
       [ 1., 0., 2., 4., 0., 0., 1.],
       [ 1., 0., 2., 3., 1., 0., 1.],
       [ 1., 0., 2., 3., 0., 1., 1.],
       [ 1., 0., 2., 3., 0., 0., 0.],
       [ 1., 0., 2., 3., 0., 0., 2.]])
I would like the fastest possible implementation of the similar_vectors(vector) function described above. It runs millions of times in my simulation, on vectors a few dozen elements long, so speed is crucial. I am interested both in pure NumPy solutions and in wrappers to other languages. The code may be parallel.
My current implementation is the following:
def singleOne(length, positionOne):
    # generates a vector of len `length` filled with zeros,
    # apart from a single one at positionOne
    arr = np.zeros(length)
    arr[positionOne] = 1
    return arr

def similar_vectors(state):
    connected = []
    for i in range(len(state)):
        if state[i] != 0:
            connected.append(state - singleOne(state.shape[0], i))
        connected.append(state + singleOne(state.shape[0], i))
    return np.array(connected)
This is terribly slow due to the for loop, which I can't easily get rid of. For reference, I'm attaching the profile from 1000 executions of similar_vectors(vector):
37003 function calls in 0.070 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.070 0.070 {built-in method builtins.exec}
1 0.003 0.003 0.070 0.070 <string>:2(<module>)
1000 0.035 0.000 0.064 0.000 <ipython-input-19-69431411f902>:6(similar_vectors)
11000 0.007 0.000 0.021 0.000 <ipython-input-19-69431411f902>:1(singleOne)
11000 0.014 0.000 0.014 0.000 {built-in method numpy.core.multiarray.zeros}
2000 0.009 0.000 0.009 0.000 {built-in method numpy.core.multiarray.array}
11000 0.002 0.000 0.002 0.000 {method 'append' of 'list' objects}
1000 0.000 0.000 0.000 0.000 {built-in method builtins.len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Approach #1
Here's a masking-based one -
def similar_vectors_masking(a):
    n = len(a)
    m = n*2 + 1
    ar = np.repeat(a[None], len(a)*2, 0)
    ar.ravel()[::m] -= 1
    ar.ravel()[n::m] += 1
    mask = np.ones(len(ar), dtype=bool)
    mask[::2] = a != 0
    out = ar[mask]
    return out
Sample runs -
In [142]: similar_vectors_masking(np.array([0,0,0]))
Out[142]:
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
In [143]: similar_vectors_masking(np.array([1,0,2,3,0,0,1]))
Out[143]:
array([[0, 0, 2, 3, 0, 0, 1],
       [2, 0, 2, 3, 0, 0, 1],
       [1, 1, 2, 3, 0, 0, 1],
       [1, 0, 1, 3, 0, 0, 1],
       [1, 0, 3, 3, 0, 0, 1],
       [1, 0, 2, 2, 0, 0, 1],
       [1, 0, 2, 4, 0, 0, 1],
       [1, 0, 2, 3, 1, 0, 1],
       [1, 0, 2, 3, 0, 1, 1],
       [1, 0, 2, 3, 0, 0, 0],
       [1, 0, 2, 3, 0, 0, 2]])
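The flattened-stride writes in similar_vectors_masking work because, with n columns and 2*n rows, a flat step of m = 2*n + 1 advances one full row plus one column per step. A minimal demo of just that trick on a blank array (this demo is my own illustration, not part of the benchmarks):

```python
import numpy as np

n = 3
m = 2 * n + 1                 # one column further for every two rows
ar = np.zeros((2 * n, n), dtype=int)

# Stepping through the flattened array by m hits (row 2*i, col i) ...
ar.ravel()[::m] -= 1          # the "minus one" rows
# ... and offsetting the start by n hits (row 2*i + 1, col i).
ar.ravel()[n::m] += 1         # the "plus one" rows
```

Each pair of consecutive rows now carries the -1/+1 perturbation for one column, which is exactly the layout the boolean mask then filters.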
Approach #2
For an array filled mostly with zeros, we might be better off replicating to the exact output size with np.repeat, then using a bit of masking again to apply the increments and decrements, like so -
def similar_vectors_repeat(a):
    mask = a != 0
    ra = np.arange(len(a))
    r = mask + 1
    n = r.sum()
    ar = np.repeat(a[None], n, axis=0)
    add_idx = r.cumsum() - 1
    ar[add_idx, ra] += 1
    ar[(add_idx - 1)[mask], ra[mask]] -= 1
    return ar
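As a sanity check (my own addition, not part of the benchmarks), the repeat-based version can be compared row for row against the straightforward loop from the question; both are redefined here so the snippet runs on its own:

```python
import numpy as np

def similar_vectors(state):
    # reference loop implementation from the question
    connected = []
    for i in range(len(state)):
        one = np.zeros(state.shape[0])
        one[i] = 1
        if state[i] != 0:
            connected.append(state - one)
        connected.append(state + one)
    return np.array(connected)

def similar_vectors_repeat(a):
    mask = a != 0
    ra = np.arange(len(a))
    r = mask + 1                            # 2 output rows per nonzero, 1 per zero
    ar = np.repeat(a[None], r.sum(), axis=0)
    add_idx = r.cumsum() - 1                # row index of each "+1" perturbation
    ar[add_idx, ra] += 1
    ar[(add_idx - 1)[mask], ra[mask]] -= 1  # "-1" row sits just above, nonzeros only
    return ar

a = np.array([1, 0, 2, 3, 0, 0, 1])
assert np.array_equal(similar_vectors(a), similar_vectors_repeat(a))
```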
Timings with all posted approaches on a large array -
In [414]: # Setup input array with ~80% zeros
...: np.random.seed(0)
...: a = np.random.randint(1,5,(5000))
...: a[np.random.choice(range(len(a)),int(len(a)*0.8),replace=0)] = 0
In [415]: %timeit similar_vectors(a) # Original soln
...: %timeit similar_vectors_flatnonzero_concat(a) # @yatu's soln
...: %timeit similar_vectors_v2(a) # @Brenlla's soln
...: %timeit similar_vectors_masking(a)
...: %timeit similar_vectors_repeat(a)
1 loop, best of 3: 195 ms per loop
1 loop, best of 3: 234 ms per loop
1 loop, best of 3: 231 ms per loop
1 loop, best of 3: 238 ms per loop
10 loops, best of 3: 82.8 ms per loop
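Since the question also allows wrappers to other languages, a preallocating scalar loop is a further option; the sketch below is my own (it was not among the timed solutions) and is written with only operations that numba's @njit, if installed, should be able to compile unchanged:

```python
import numpy as np

# Hypothetical sketch: preallocate the exact (n + nonzeros, n) output and
# fill it row by row. Decorating with numba.njit is an untested assumption.
def similar_vectors_loop(a):
    n = a.shape[0]
    nnz = int(np.count_nonzero(a))
    out = np.empty((n + nnz, n), dtype=a.dtype)
    row = 0
    for i in range(n):
        if a[i] != 0:
            out[row] = a          # "-1" row only for nonzero elements
            out[row, i] -= 1
            row += 1
        out[row] = a              # "+1" row for every element
        out[row, i] += 1
        row += 1
    return out
```

This preserves the row ordering of the original function while avoiding the per-iteration allocations that dominate its profile.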