Search code examples
pythonbinarybit-manipulationbitwise-operators

Getting the position of 1-bits in a python Long object


Let's say I have a very very large python integer, in python 2.7 (though if I need to, I don't mind switching to python 3).

Something bigger than say, 2^100000.

What is the fastest way by which I can find the positions of all 1s in it's binary sequence? (example: 24 would be 11000 ---> = [4,5] (or [5,4].. I don't care about order)

At the moment I am using:

sum = whatever_starting_number

while 1:
    val = sum.bit_length()-1
    sum -= 2**val
    mylist.append(val)
    if sum == 0:
        break

this is alright, but it's barely even faster than just taking log2 and subtracting that repeatedly. What I would like to actually do is just look at the bits, skip zeros, record positions of 1s, and not even need to modify the original value.

edit: getting multiple answers, really appreciate it. I'll implement them in a couple timeit tests and this will be updated with results by tomorrow.


Solution

  • Probably not the fastest solution, but fairly simple and seems fast enough (2^1M was instant).

    bits = []
    for i, c in enumerate(bin(2**1000000)[:1:-1], 1):
        if c == '1':
            bits.append(i)
    

    Just in case the [:1:-1] wasn't clear, it is called "extended slicing", more info here: https://docs.python.org/2/whatsnew/2.3.html#extended-slices.

    Edit: I decided to time the 3 solutions posted in answers, here are the results:

    import timeit
    
    
    def fcn1():
        sum = 3**100000
        one_bit_indexes = []
        index = 0
        while sum: # returns true if sum is non-zero
            if sum & 1: # returns true if right-most bit is 1
                one_bit_indexes.append(index)
            sum >>= 1 # discard the right-most bit
            index += 1
        return one_bit_indexes
    
    
    def fcn2():
        number = 3**100000
        bits = []
        for i, c in enumerate(bin(number)[:1:-1], 1):
            if c == '1':
                bits.append(i)
        return bits
    
    
    def fcn3():
        sum = 3**100000
        return [i for i in range(sum.bit_length()) if sum & (1<<i)]
    
    
    print(timeit.timeit(fcn1, number=1))
    print(timeit.timeit(fcn2, number=1))
    print(timeit.timeit(fcn3, number=1))
    

    For 3^100k:
    fcn1: 0.7462488659657538
    fcn2: 0.02108444197801873
    fcn3: 0.40482770901871845

    For 3^1M:
    fcn1: 70.9139410170028
    fcn2: 0.20711017202120274
    fcn3: 43.36111917096423