I am working on a problem where I need to manipulate binary data. The easiest way for me to do this would be with arrays, since binary string representations are not allowed. I can get the bits with the code below, or with bin(), but the numbers are always converted into strings. I need an array of type int or bool. How can this be done?
ascii = list("ABCD".encode('ascii'))
arr = list(map(lambda x: [*format(x, '08b')], ascii))
I tried using bin() and format() to get the binary representation in Python, but both return strings.
ascii = list("ABCD".encode('ascii'))
res = [[(byte >> (7 - i)) & 1 for i in range(8)] for byte in ascii]
print(res)
[[0, 1, 0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[0, 1, 0, 0, 0, 1, 0, 0]]
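If this is needed in more than one place, the comprehension above could be wrapped in a small helper. This is only a sketch: the name bytes_to_bits and the as_bool flag are made up for illustration and are not part of any library.

```python
def bytes_to_bits(data: bytes, as_bool: bool = False) -> list:
    """Return one list of 8 bits (most significant bit first) per byte."""
    # Shift each bit position down to the lowest place, then mask with 1.
    rows = [[(byte >> (7 - i)) & 1 for i in range(8)] for byte in data]
    if as_bool:
        # Optionally convert the 0/1 ints to real bool objects.
        rows = [[bool(bit) for bit in row] for row in rows]
    return rows


bits = bytes_to_bits("ABCD".encode("ascii"))
bool_bits = bytes_to_bits(b"A", as_bool=True)
print(bits[0])       # [0, 1, 0, 0, 0, 0, 0, 1]
print(bool_bits[0])  # [False, True, False, False, False, False, False, True]
```

Iterating over a bytes object already yields ints, so the intermediate list(...) call is not required here.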
You can also use numpy.array() to save the bits, if you have numerous operations:
import numpy as np
ascii = list("ABCD".encode('ascii'))
res = []
for byte in ascii:
    for i in range(8):
        # Shift bit i (counted from the most significant end) to the lowest place and mask it.
        bit = (byte >> (7 - i)) & 1
        res.append(bit)
print(np.array(res, dtype=bool))
[False True False False False False False True False True False False False False True False False True False False False False True True False True False False False True False False]
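As an alternative to the explicit loop, NumPy can do the unpacking itself with np.unpackbits, which expands each uint8 into its 8 bits, most significant bit first by default. A minimal sketch, assuming the input is the same "ABCD" byte string (the variable names are only illustrative):

```python
import numpy as np

# View the encoded bytes as an array of unsigned 8-bit integers (no copy).
raw = np.frombuffer("ABCD".encode("ascii"), dtype=np.uint8)

# Expand every byte into 8 values of 0 or 1, most significant bit first.
bits = np.unpackbits(raw)

# One row of 8 bits per input byte.
bit_rows = bits.reshape(-1, 8)

# Same data as booleans, if a bool array is preferred over 0/1 integers.
bool_rows = bit_rows.astype(bool)

print(bit_rows)
```

The result of np.unpackbits has dtype uint8, so the values can be used directly in bitwise operations or converted to bool as shown.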
Isn't Numpy kind of overkill just to convert 1 and 0 to True and False? Why not: res.append(bool(bit)) or even just leave them as ints? by @Mark
Every single element in a Python list is a separate object.
NumPy arrays use a single, fixed element type instead of dynamically typed elements, as a Python list does. This makes NumPy arrays more efficient for computationally intensive tasks.
Unless we are dealing with something like 10 million operations, this could be overkill. But since the question relates to bit manipulation, the data could be large and the number of operations high, so NumPy arrays are an efficient choice.
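import time
import numpy

# Build the same 100 million integers as a Python list and as a NumPy array.
data = list(range(100000000))
arr = numpy.array(data)

# Time doubling every element with a list comprehension.
start = time.time()
list_mult = [x * 2 for x in data]
end = time.time()
print(f"List: {end - start} seconds")

# Time the same operation as a single vectorized NumPy expression.
start = time.time()
arr_mult = arr * 2
end = time.time()
print(f"Array: {end - start} seconds")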
List: 12.309393882751465 seconds
Array: 2.8275811672210693 seconds
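The difference is not only in speed. A rough sketch of the memory side, assuming CPython and a small made-up sample of 1000 bits: sys.getsizeof on a list counts only the list object and its element pointers, while nbytes reports the size of the array's data buffer.

```python
import sys

import numpy as np

n = 1000  # illustrative sample size, not taken from the benchmark above
bit_list = [True, False] * (n // 2)
bit_arr = np.array(bit_list, dtype=bool)

# Size of the list object itself: header plus one pointer per element.
list_bytes = sys.getsizeof(bit_list)

# Size of the array's data buffer: one byte per bool element.
arr_bytes = bit_arr.nbytes

print(f"list: {list_bytes} bytes, array data: {arr_bytes} bytes")
```

The exact list size depends on the Python build, so treat the printed figures as indicative only.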
How (byte >> (7 - i)) & 1 works, taking byte = 65 ('A', binary 01000001) as the example:
i = 0: (65 >> (7 - 0)) & 1 → (65 >> 7) & 1 → 00000000 & 1 → 0
i = 1: (65 >> (7 - 1)) & 1 → (65 >> 6) & 1 → 00000001 & 1 → 1
i = 2: (65 >> (7 - 2)) & 1 → (65 >> 5) & 1 → 00000010 & 1 → 0
i = 3: (65 >> (7 - 3)) & 1 → (65 >> 4) & 1 → 00000100 & 1 → 0
i = 4: (65 >> (7 - 4)) & 1 → (65 >> 3) & 1 → 00001000 & 1 → 0
i = 5: (65 >> (7 - 5)) & 1 → (65 >> 2) & 1 → 00010000 & 1 → 0
i = 6: (65 >> (7 - 6)) & 1 → (65 >> 1) & 1 → 00100000 & 1 → 0
i = 7: (65 >> (7 - 7)) & 1 → (65 >> 0) & 1 → 01000001 & 1 → 1
[False True False False False False False True # 'A' -> 01000001
False True False False False False True False # 'B' -> 01000010
False True False False False False True True # 'C' -> 01000011
False True False False False True False False] # 'D' -> 01000100
Note that the bitwise & between any bit and 1 returns the bit itself, which effectively filters out all the other bits:
0 & 1 → 0
1 & 1 → 1
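The shift-and-mask steps can be checked by printing each intermediate value. A minimal sketch for the byte 65 ('A'); the variable names are only illustrative:

```python
byte = 65  # ord('A'), binary 01000001
bits = []

for i in range(8):
    # Move the bit at position i (counted from the most significant end)
    # down to the lowest place.
    shifted = byte >> (7 - i)
    # Masking with 1 keeps only that lowest bit and discards the rest.
    bit = shifted & 1
    bits.append(bit)
    print(f"i = {i}: {byte} >> {7 - i} = {shifted:08b}, & 1 -> {bit}")

print(bits)  # [0, 1, 0, 0, 0, 0, 0, 1]
```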