python encoding protocol-buffers varint

Given an integer, how big is its varint encoding?


I have a Python list of integers, and I'd like to know how much space it will take up when encoded as a sequence of Protocol Buffers variable-length integers, or varints. What's the best way to figure this out without actually encoding the integers?

my_numbers = [20, 69, 500, 38987982344444, 420, 99, 1, 999]
e = MyCoolVarintArrayEncoder(my_numbers)
print(len(e))  # ???

Solution

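  • Each integer is encoded in base 128, one byte per "digit": every byte carries 7 bits of the value plus a continuation bit. The number of digits a positive integer needs in a given base is floor(log(value, base)) + 1, not ceil(log(value, base)). The two differ at exact powers of the base: 128 needs two bytes, but ceil(log(128, 128)) is 1, and 1 needs one byte, but ceil(log(1, 128)) is 0. Zero is a special case that still takes one byte.

    So for each non-negative integer, take floor(log(base=128)) + 1, or equivalently divide its bit length by 7 and round up (with a minimum of 1), then sum those values, and there's your length. Working from the bit length is safer than a floating-point log, which can round the wrong way for large values. Note that this covers unsigned varints only: Protocol Buffers encodes a negative int32 or int64 as a ten-byte varint, and the sint32/sint64 types apply ZigZag encoding first.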
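
    A minimal sketch of that calculation, assuming all values are non-negative (unsigned varints). It counts 7-bit groups with int.bit_length() rather than a floating-point log, which sidesteps rounding trouble at exact powers of 128. The names varint_size and encode_varint are made up for this example and are not part of the protobuf library; the encoder is only there to cross-check the sizes.

```python
def varint_size(value: int) -> int:
    """Bytes needed for the unsigned base-128 varint encoding of value."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Each byte carries 7 payload bits; zero still occupies one byte.
    return max(1, (value.bit_length() + 6) // 7)


def encode_varint(value: int) -> bytes:
    """Reference encoder (little-endian 7-bit groups), used only to verify sizes."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


my_numbers = [20, 69, 500, 38987982344444, 420, 99, 1, 999]
total = sum(varint_size(n) for n in my_numbers)
print(total)  # 17
```

    The per-value sizes here are 1, 1, 2, 7, 2, 1, 1 and 2 bytes. If the list is stored as a packed repeated field, the field's tag and length prefix add a few more bytes on top of this total.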