Is there a way to get a binary number array without converting it to a string first?

I am working on a problem where I need to manipulate binary data. The easiest way for me to do this would be through arrays, since binary string representations are not allowed. I can write the code below, or do the same thing with bin(), but the numbers always end up as strings. I need an array of type int or bool. How can this be done?

ascii = list("ABCD".encode('ascii'))
arr = list(map(lambda x: [*format(x, '08b')], ascii))

I tried using bin() and format() to get the binary digits in Python, but both return strings.

Solution

Using strings for binary data is not ideal: each character takes at least one byte of memory, yet carries only a single bit of information.

List comprehension (suggested by @Mark in the comments):

Since you don't seem to have many operations, a list of lists is fine for storing the bits:

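# Iterating over a bytes object yields ints (0-255) directly; no string step is needed.
data = "ABCD".encode('ascii')
# Shift bit i (counted from the most significant bit) into the lowest position, then mask it.
res = [[(byte >> (7 - i)) & 1 for i in range(8)] for byte in data]
print(res)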

Prints

[[0, 1, 0, 0, 0, 0, 0, 1], 
[0, 1, 0, 0, 0, 0, 1, 0], 
[0, 1, 0, 0, 0, 0, 1, 1], 
[0, 1, 0, 0, 0, 1, 0, 0]]
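
The comprehension above can be wrapped in a small helper, together with its inverse to check that no information is lost. This is only a sketch: the function names `bytes_to_bits` and `bits_to_bytes` are made up for illustration and are not part of the original answer.

```python
def bytes_to_bits(data):
    """Return one list of 8 ints (most significant bit first) per byte."""
    return [[(byte >> (7 - i)) & 1 for i in range(8)] for byte in data]


def bits_to_bytes(rows):
    """Inverse of bytes_to_bits: rebuild a bytes object from rows of 8 bits."""
    out = bytearray()
    for row in rows:
        value = 0
        for bit in row:
            # Shift what we have so far left by one and append the next bit.
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


bits = bytes_to_bits("ABCD".encode("ascii"))
print(bits[0])              # [0, 1, 0, 0, 0, 0, 0, 1]
print(bits_to_bytes(bits))  # b'ABCD'
```

The round trip back to `b'ABCD'` is a quick way to confirm that the bit order (most significant bit first) is what you expect.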

If you had many operations, you could store the bits in a NumPy array instead:

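import numpy as np

data = "ABCD".encode('ascii')
res = []

for byte in data:
    for i in range(8):
        # Extract bit i, counting from the most significant bit.
        bit = (byte >> (7 - i)) & 1
        res.append(bit)

# dtype=bool stores every bit as True/False in one flat array.
print(np.array(res, dtype=bool))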

Prints

[False True False False False False False True False True False False False False True False False True False False False False True True False True False False False True False False]
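
For comparison, the same flat sequence can be produced without NumPy. A minimal sketch, assuming plain Python bools are acceptable for the task; the name `flat` is illustrative.

```python
data = "ABCD".encode("ascii")

# One bool per bit, most significant bit first, all bytes concatenated.
flat = [bool((byte >> (7 - i)) & 1) for byte in data for i in range(8)]

print(len(flat))  # 32
print(flat[:8])   # [False, True, False, False, False, False, False, True]
```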

Comments

Isn't Numpy kind of overkill just to convert 1 and 0 to True and False? Why not: res.append(bool(bit)) or even just leave them as ints? by @Mark

Every element of a Python list is a separate object.
NumPy arrays hold a single element type and do not type each element dynamically, as a Python list does. This makes NumPy arrays more efficient for computationally intensive tasks.
Unless we are dealing with something on the order of 10 million operations, NumPy could be overkill. But since the question is about bit manipulation, the data could be large and the number of operations high, in which case NumPy arrays are an efficient choice.
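
To make the per-element overhead concrete without installing anything, the sketch below compares a list of bits with a `bytearray` holding the same bits. `bytearray` is used here only as a standard-library stand-in for a fixed-type container; it is not NumPy, and the exact sizes reported vary by Python build.

```python
import sys

data = "ABCD".encode("ascii") * 10_000  # 40,000 bytes of sample input

# Flat list of ints: the list stores one pointer per element.
bit_list = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]

# Same bits in a bytearray: one raw byte per element, no per-element pointer.
bit_bytes = bytearray(bit_list)

# getsizeof(list) covers the list header plus one pointer per element,
# not the int objects the pointers refer to.
list_size = sys.getsizeof(bit_list)
# getsizeof(bytearray) covers the raw buffer plus a small header.
bytes_size = sys.getsizeof(bit_bytes)

print(len(bit_list), list_size, bytes_size)
```

On CPython the list comes out several times larger than the `bytearray`, which is the same effect the comment above attributes to typed arrays.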

Naked Benchmark


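import time
import numpy

data = list(range(100000000))  # 100 million ints; needs several GB of RAM
arr = numpy.array(data)

# Time an element-wise doubling of the plain Python list.
start = time.perf_counter()
list_mult = [x * 2 for x in data]
end = time.perf_counter()
print(f"List: {end - start} seconds")

# Time the same operation, vectorized, on the NumPy array.
start = time.perf_counter()
arr_mult = arr * 2
end = time.perf_counter()
print(f"Array: {end - start} seconds")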

Prints

List: 12.309393882751465 seconds 
Array: 2.8275811672210693 seconds

Note

`(byte >> (7 - i)) & 1`, traced for byte = 65 ('A', binary 01000001):

i = 0: (65 >> (7 - 0)) & 1 → (65 >> 7) & 1 → 00000000 & 1 → 0
i = 1: (65 >> (7 - 1)) & 1 → (65 >> 6) & 1 → 00000001 & 1 → 1
i = 2: (65 >> (7 - 2)) & 1 → (65 >> 5) & 1 → 00000010 & 1 → 0
i = 3: (65 >> (7 - 3)) & 1 → (65 >> 4) & 1 → 00000100 & 1 → 0
i = 4: (65 >> (7 - 4)) & 1 → (65 >> 3) & 1 → 00001000 & 1 → 0
i = 5: (65 >> (7 - 5)) & 1 → (65 >> 2) & 1 → 00010000 & 1 → 0
i = 6: (65 >> (7 - 6)) & 1 → (65 >> 1) & 1 → 00100000 & 1 → 0
i = 7: (65 >> (7 - 7)) & 1 → (65 >> 0) & 1 → 01000001 & 1 → 1

[False  True False False False False False  True  # 'A' -> 01000001
 False  True False False False False  True False  # 'B' -> 01000010
 False  True False False False False  True  True  # 'C' -> 01000011
 False  True False False False  True False False] # 'D' -> 01000100

Note that the bitwise & between any bit and 1 returns the bit itself, which effectively filters out all the other bits:

0 & 1 → 0
1 & 1 → 1
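
The trace above can be reproduced with a short loop. This sketch just prints each step for `byte = 65`; the variable names are illustrative, and the formatted string is used for display only.

```python
byte = 65  # ord('A'), binary 01000001

bits = []
for i in range(8):
    shifted = byte >> (7 - i)  # move bit i (from the left) into the lowest position
    bit = shifted & 1          # mask away everything except that lowest bit
    bits.append(bit)
    print(f"i = {i}: {byte} >> {7 - i} = {shifted:08b}, & 1 -> {bit}")

print(bits)  # [0, 1, 0, 0, 0, 0, 0, 1]
```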
