Is there an efficient way in Python to get an array of the boolean values stored at the n-th bit position of each row of a bit-packed array?
import numpy as np
array = np.array(
[
[1, 0, 1],
[1, 1, 1],
[0, 0, 1],
]
)
pack_array = np.packbits(array, axis=1)
For n = 1, the expected result is:
array([0, 1, 0])
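To make the bit layout concrete, here is a minimal sketch of the small example, assuming the default big-endian bit order of `np.packbits` (the first column ends up in the most significant bit, and unused bits are zero-padded). The names `n`, `mask` and `column` are only for illustration:

```python
import numpy as np

array = np.array([[1, 0, 1], [1, 1, 1], [0, 0, 1]], dtype=np.uint8)
pack_array = np.packbits(array, axis=1)  # shape (3, 1), one byte per row
print(pack_array.ravel())                # packed bytes: 160, 224, 32

n = 1
mask = 128 >> (n % 8)                    # 64: bit n counted from the most significant bit
column = (pack_array[:, n // 8] & mask) != 0
print(column.astype(int))                # column 1 of the original array: 0, 1, 0
```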
I have tried numba with the following function. It returns the right results, but it is very slow:
import numpy as np
from numba import njit
@njit(nopython=True, fastmath=True)
def getVector(packed, j):
    n = packed.shape[0]
    res = np.zeros(n, dtype=np.int32)
    for i in range(n):
        res[i] = bool(packed[i, j//8] & (128>>(j%8)))
    return res
How to test it?
import numpy as np
import time
from numba import njit
array = np.random.choice(a=[False, True], size=(100000000,15))
pack_array = np.packbits(array, axis=1)
start = time.time()
array[:,10]
print('np array')
print(time.time()-start)
@njit(nopython=True, fastmath=True)
def getVector(packed, j):
    n = packed.shape[0]
    res = np.zeros(n, dtype=np.int32)
    for i in range(n):
        res[i] = bool(packed[i, j//8] & (128>>(j%8)))
    return res
# To initialize
getVector(pack_array, 10)
start = time.time()
getVector(pack_array, 10)
print('getVector')
print(time.time()-start)
It returns:
np array
0.00010132789611816406
getVector
0.15648770332336426
Besides some micro-optimisations, I don't believe that there is much that can be optimised here. There are also a few small mistakes in your code.
My updated code (seeing a meagre 40% performance increase on my machine):
import numba as nb
import numpy as np
np.random.seed(0)
array = np.random.choice(a=[False, True], size=(10000000,15))
pack_array = np.packbits(array, axis=1)
@nb.njit(locals={'res': nb.boolean[:]})
def getVector(packed, j):
    n = packed.shape[0]
    res = np.zeros(n, dtype=nb.boolean)
    byte = j//8
    bit = 128>>(j%8)
    for i in range(n):
        res[i] = bool(packed[i, byte] & bit)
    return res
getVector(pack_array, 10)
In your code, "res" is an array of 32-bit integers. By giving np.zeros() the numba (NOT numpy) boolean datatype, we can swap it to the more efficient booleans. This is where most of the performance improvement comes from. On my machine, moving the byte index and bit mask (byte and bit) outside of the loop had no noticeable effect. But it did have an effect for the commenter @Michael Szczesny, so it might help you as well.
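A rough way to see why the result dtype matters is to compare the memory each output array occupies. This is only a sketch using plain NumPy dtypes outside of numba; the array length `n` is an arbitrary value chosen for illustration:

```python
import numpy as np

n = 1_000_000  # arbitrary length, for illustration only

res_int32 = np.zeros(n, dtype=np.int32)
res_bool = np.zeros(n, dtype=np.bool_)

# int32 uses 4 bytes per element, bool uses 1 byte per element,
# so the boolean result writes a quarter of the memory.
print(res_int32.itemsize, res_int32.nbytes)
print(res_bool.itemsize, res_bool.nbytes)
```

Since the loop does very little work per element, writing a quarter of the bytes is a plausible reason for the speed-up, although the exact gain will depend on the machine.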
I would not try to use strides, which @Nick ODell is suggesting, because they can be quite dangerous if used incorrectly. (See the numpy documentation).
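For comparison, the same bit test can also be written with ordinary NumPy indexing, which needs neither numba nor manual stride manipulation. This is a sketch only: `get_vector_numpy` is a hypothetical helper name, and I have not benchmarked it against the numba version.

```python
import numpy as np

def get_vector_numpy(packed, j):
    """Return bit j of every row of `packed` as a boolean array."""
    byte = j // 8                    # which packed byte holds bit j
    bit = np.uint8(128 >> (j % 8))   # mask for bit j, most significant bit first
    # Basic indexing yields a view of one byte column; the mask test is vectorised.
    return (packed[:, byte] & bit) != 0

rng = np.random.default_rng(0)
array = rng.integers(0, 2, size=(1000, 15)).astype(bool)
pack_array = np.packbits(array, axis=1)

# The extracted column should match the unpacked original.
print(np.array_equal(get_vector_numpy(pack_array, 10), array[:, 10]))
```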
edit: I have made a few small changes that were suggested by Michael. (Thanks)