numpy redis-py

Selecting rows from ndarray via bytearray


I have a bytearray that is pulled from redis.

r.set('a', '')
r.setbit('a', 0, 1)
r.setbit('a', 1, 1)
r.setbit('a', 12, 1)

a_raw = r.get('a')
# b'\xc0\x08'
a_bin = bin(int.from_bytes(a_raw, byteorder="big"))
# 0b1100000000001000

I want to use the set bits in that bytearray to select rows from an ndarray.

arr = np.arange(16)  # 16 rows, one per bit in a_raw
arr[a_raw]  # desired behaviour, not valid NumPy indexing as written
# array([0, 1, 12])

Edit: Both solutions work, but I found @paul-panzer's to be faster:

import timeit

setup = '''import numpy as np; a = b'\\xc0\\x08'; '''

t1 = timeit.timeit('idx = np.unpackbits(np.frombuffer(a, np.uint8)); np.where(idx)', 
              setup = setup, number=10000)

t2 = timeit.timeit('idx = np.array(list(bin(int.from_bytes(a, byteorder="big"))[2:])) == "1"; np.where(idx)',
              setup = setup, number=10000)

print(t1, t2)
#0.019560601096600294 0.054518797900527716

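Edit 2: Actually, the from_bytes method doesn't return what I'm looking for. Converting to an int and calling bin() drops the leading zero bits, so the positions come out shifted: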

redis_db.delete('timeit_test')
redis_db.setbit('timeit_test', 12666, 1)
redis_db.setbit('timeit_test', 14379, 1)
by = redis_db.get('timeit_test')

idx = np.unpackbits(np.frombuffer(by, np.uint8))
indices = np.where(idx)

idx = np.array(list(bin(int.from_bytes(by, byteorder="big"))[2:])) == "1"
indices_2 = np.where(idx)

print(indices, indices_2)
#(array([12666, 14379]),) (array([   1, 1714]),)
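
The shift comes from the leading zero bits: `int.from_bytes` yields an integer, and `bin` prints it without leading zeros, so every index is offset by the number of zero bits before the first set bit. A minimal sketch, using a hand-built byte string instead of a live redis value (the sample `by` below is an assumption made for illustration), shows the effect and one possible way to keep the int-based approach correct by zero-padding to `8 * len(by)` bits:

```python
import numpy as np

# Hypothetical sample: two zero bytes, then bits 18 and 31 set
# (bit 0 is the most significant bit of the first byte, as with redis SETBIT).
by = b'\x00\x00\x20\x01'

# bin() drops leading zeros, so the positions are shifted.
unpadded = bin(int.from_bytes(by, byteorder="big"))[2:]
shifted = np.flatnonzero(np.array(list(unpadded)) == "1")

# Zero-padding to the full bit width restores the true positions.
padded = format(int.from_bytes(by, byteorder="big"), "0{}b".format(8 * len(by)))
fixed = np.flatnonzero(np.array(list(padded)) == "1")

# unpackbits keeps every bit, so it needs no padding.
reference = np.flatnonzero(np.unpackbits(np.frombuffer(by, np.uint8)))

print(shifted, fixed, reference)
# [ 0 13] [18 31] [18 31]
```

The distance between the two set bits is preserved in the unpadded result, which is why the error is easy to miss on short inputs such as `b'\xc0\x08'`, where the first bit is already set.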

Solution

  • Here is a way using unpackbits:

    >>> import numpy as np
    >>> a = b'\xc0\x08'
    >>> b = np.arange(32).reshape(16, 2)
    >>> c = np.arange(40).reshape(20, 2)
    >>> 
    >>> idx = np.unpackbits(np.frombuffer(a, np.uint8))
    >>> 
    # if the sizes match, boolean indexing can be used
    >>> b[idx.view(bool)]
    array([[ 0,  1],
           [ 2,  3],
           [24, 25]])
    >>> 
    # non-matching sizes can be worked around using where
    >>> c[np.where(idx)]
    array([[ 0,  1],
           [ 2,  3],
           [24, 25]])
    >>>
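
When the bitmap length and the row count may differ, the two cases above can be folded into one helper. This is only a sketch under stated assumptions: the name `select_rows` is hypothetical, and silently dropping set bits that point past the end of the array is a design choice made here, not something the answer prescribes.

```python
import numpy as np

def select_rows(arr, raw):
    """Return the rows of arr whose index corresponds to a set bit in raw.

    Bit 0 is the most significant bit of raw[0], matching redis SETBIT offsets.
    """
    bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))
    rows = np.flatnonzero(bits)
    # Assumption: ignore set bits that have no matching row.
    rows = rows[rows < len(arr)]
    return arr[rows]

a = b'\xc0\x08'
print(select_rows(np.arange(40).reshape(20, 2), a))  # rows 0, 1 and 12
print(select_rows(np.arange(12), a))                 # bit 12 is out of range and dropped
```

If an out-of-range bit should be treated as an error instead, the filtering line could be replaced with an explicit check that raises.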