python type-conversion byte bit

Convert integer to half-byte, `to_bytes()`?


I have a list of integers:

var = [1, 5, 4, 17, 231, 89]

If I want to convert it to a list of bytes instead, I can do:

[(i).to_bytes(1, byteorder='big') for i in var]

Since each value in var is less than 256, each integer fits in one byte. But if I had another list, say:

var = [1, 2, 15, 12]

Here each integer fits in half a byte, or, closer to what I'm looking for, two integers fit in one byte.

How can I specify to combine two integers into a byte if possible, both to and from?

Something like below:

var1 = [1, 5, 4, 17, 231, 89]
var2 = [1, 2, 15, 12]

def foo(var):
  if max(var) < 2**4:
    num_bytes = 0.5
  elif max(var) < 2**8:
    num_bytes = 1
  elif max(var) < 2**16:
    num_bytes = 2

  if num_bytes >= 1:
    return [(i).to_bytes(num_bytes, byteorder='big') for i in var], num_bytes
  elif num_bytes == 0.5:
    # convert var to list of nybbles
    # combine two nybbles to a byte
    # create list of bytes (length is half length of var)
    # return it and num_bytes

def unfoo(var, num_bytes):
  if num_bytes >= 1:
    print([int.from_bytes(i, 'big') for i in var])
  elif num_bytes == 0.5:
    # print original list of integers again

I want to convert a list of integers into a list of bytes, packing two nybbles into each byte when every value fits in a nybble, and then convert back.

Desired outcome is:

a, b = foo(var1)
unfoo(a, b) # prints [1, 5, 4, 17, 231, 89]

a, b = foo(var2)
unfoo(a, b) # prints [1, 2, 15, 12]

I don't want each number stored in its own smallest number of bits. The width depends on max(list): if all values fit in 8 bits, use 8 bits each; if all values fit in 16 bits, use 16 bits each; if all values fit in a nybble, pack pairs of nybbles into a list of bytes.

Basically, if I have two integers that each fit in a nybble, how do I concatenate them into a single byte? And if I know each byte holds two values, how do I split it? I can always assume the original list has an even length.


Solution

  • First work out how many numbers fit in a byte. Then shift each number by the appropriate number of bits and build a new list of the combined values. If two numbers fit in a byte, the first goes in the high four bits and the second in the low four: new_number = (old_num1 << 4) + old_num2

    def foo(var):
        # Bits needed for the largest value (assumes non-negative integers).
        # bit_length() is exact and also handles 0, unlike math.log2.
        bits = max(var).bit_length()
        if bits > 4:
            num_bytes = (bits + 7) // 8  # round up to whole bytes
            return [(i).to_bytes(num_bytes, byteorder='big') for i in var], num_bytes
        else:
            num_bytes = 0.5
            shift_bits = 4 # or generally, int(num_bytes * 8)
            # First value of each pair goes in the high nybble, second in the low
            new_list = [(a << shift_bits) + b for a, b in zip(var[::2], var[1::2])]
            return [(i).to_bytes(1, byteorder='big') for i in new_list], num_bytes
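
To make the shift-and-add step concrete, here is one pair worked through by hand. This is only an illustrative sketch: the values 15 and 12 are taken from var2, and the variable names follow the explanation above.

```python
old_num1, old_num2 = 15, 12   # both fit in 4 bits

# 15 << 4 moves 0b1111 into the high nybble: 0b11110000 == 240
new_number = (old_num1 << 4) + old_num2   # 240 + 12 == 252 == 0xFC

packed = new_number.to_bytes(1, byteorder='big')
print(bin(new_number))  # 0b11111100
print(packed)           # b'\xfc'
```

Because old_num2 is below 16, it never touches the high four bits, so + and | give the same result here.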
    

    To unfoo, do the inverse. When num_bytes < 1, the shift is the same number of bits used in foo(). Using the same names as above, and given new_number, old_num2 is the least-significant four bits (new_number & mask) and old_num1 is the most-significant four bits (new_number >> shift_bits).

    def unfoo(var, num_bytes):
        if num_bytes >= 1:
            return [int.from_bytes(i, 'big') for i in var]
        elif num_bytes == 0.5:
            new_list = [int.from_bytes(i, 'big') for i in var]
            ret = []
            shift_bits = 4 # in general: int(num_bytes * 8)
            mask = (1 << shift_bits) - 1  # 0b1111 for a 4-bit shift
            for i in new_list:
                b = i & mask
                a = (i >> shift_bits) & mask
                ret.append(a)
                ret.append(b)
            return ret
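
The split can be checked on a single byte the same way. This is a small sketch, assuming the byte b'\xfc' (the pair 15, 12 packed as described above); the divmod line is just an equivalent way to view the same split, not part of unfoo().

```python
packed = b'\xfc'                            # one byte holding two 4-bit values
new_number = int.from_bytes(packed, 'big')  # 252

shift_bits = 4
mask = (1 << shift_bits) - 1                # 0b1111 == 15

old_num1 = new_number >> shift_bits         # high nybble
old_num2 = new_number & mask                # low nybble
print(old_num1, old_num2)                   # 15 12

# Dividing by 16 gives the same two halves as shifting and masking
assert divmod(new_number, 16) == (old_num1, old_num2)
```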
    

    Checking that this works:

    var1 = [1, 5, 4, 17, 231, 89]
    var2 = [1, 2, 15, 12]
            
    a, b = foo(var1)
    c = unfoo(a, b) # returns [1, 5, 4, 17, 231, 89]
    
    print(var1)
    print(c)
    
    
    a2, b2 = foo(var2)
    c2 = unfoo(a2, b2) # returns [1, 2, 15, 12]
    print(var2)
    print(c2)
    

    gives the expected output:

    [1, 5, 4, 17, 231, 89]
    [1, 5, 4, 17, 231, 89]
    [1, 2, 15, 12]
    [1, 2, 15, 12]
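
If a single bytes object is acceptable instead of a list of one-byte objects, the nybble case can be written more compactly. This is a sketch of an alternative, not the approach above: pack_nybbles and unpack_nybbles are hypothetical names, and it assumes an even-length list of integers in the range 0 to 15.

```python
def pack_nybbles(values):
    """Pack an even-length list of ints in 0..15 into one bytes object."""
    if len(values) % 2:
        raise ValueError("need an even number of values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("every value must fit in 4 bits")
    # First value of each pair goes in the high nybble, second in the low
    return bytes((hi << 4) | lo for hi, lo in zip(values[::2], values[1::2]))


def unpack_nybbles(packed):
    """Inverse of pack_nybbles: split every byte back into two ints."""
    out = []
    for byte in packed:          # iterating over bytes yields ints
        out.append(byte >> 4)    # high nybble
        out.append(byte & 0x0F)  # low nybble
    return out


packed = pack_nybbles([1, 2, 15, 12])
print(packed.hex())            # 12fc
print(unpack_nybbles(packed))  # [1, 2, 15, 12]
```

The explicit checks turn an odd-length or out-of-range input into an error rather than silently dropping or corrupting values.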