Search code examples
pythonbisectiongit-bisectbisect

Find where f(x) changes in a list, with bisection (in Python)


Reasoning: I'm trying to implement, in Python, something similar to git bisect, but with basically a list of directories.

I have a (long) list of version numbers like this: ['1.0', '1.14', '2.3', '3.1', '4']

I have a function works() which takes a version number, and returns a value.

[works(x) for x in my_list] would look like: ['foo', 'foo', 'foo', 'bar', 'bar'] ... but running works() is very expensive.

I would like to do some kind of bisect which will find the change boundary.


Solution

  • You could simply use binary search:

    def binary_f(f,list):
        frm = 0
        to = len(list)
        while frm < to:
            mid = (frm+to)>>1
            if f(list[mid]):
                to = mid
            else:
                frm = mid+1
        return frm
    

    It will return the first index i for which bool(f(list[i])) is True.

    Of course the function assumes that the the map of f on the list is of the form:

    f(list) == [False,False,...,False,True,True,...,True]
    

    If this is not the case, it will usually find a swap but which one is rather undefined.

    Say f is simply "the version is 2 or higher" so lambda v:v >= '2', then it will return:

    >>> binary_f(lambda v:v >= '2',['1.0', '1.14', '2.3', '3.1', '4'])
    2
    

    So index 2. In case the entire list would return with False objects, it will **return len(list). Since it "assumes" the element just outside the list will be evaluated to True:

    >>> binary_f(lambda v:v >= '4.2',['1.0', '1.14', '2.3', '3.1', '4'])
    5
    

    Of course in your example f is works.

    Experiments:

    >>> binary_f(lambda v:v >= '2',['1.0', '1.14', '2.3', '3.1', '4'])
    2
    >>> binary_f(lambda v:v >= '0',['1.0', '1.14', '2.3', '3.1', '4'])
    0
    >>> binary_f(lambda v:v >= '1',['1.0', '1.14', '2.3', '3.1', '4'])
    0
    >>> binary_f(lambda v:v >= '1.13',['1.0', '1.14', '2.3', '3.1', '4'])
    1
    >>> binary_f(lambda v:v >= '2.4',['1.0', '1.14', '2.3', '3.1', '4'])
    3
    >>> binary_f(lambda v:v >= '3',['1.0', '1.14', '2.3', '3.1', '4'])
    3
    >>> binary_f(lambda v:v >= '3.2',['1.0', '1.14', '2.3', '3.1', '4'])
    4
    >>> binary_f(lambda v:v >= '4.2',['1.0', '1.14', '2.3', '3.1', '4'])
    5
    

    (I here of course did a very cheap version check, but it works of course for more sophisticated predicates).

    Since this is binary search, it will run in O(log n) with n the number of items in the list whereas linear search can result in O(n) checks (which is usually more expensive).

    EDIT: in case the list contains two values and you want to find the swap, you can simply first compute the value for index 0:

    val0 = f(list[0])
    

    and then provide binary_f:

    binary_f(lambda v:works(v) != val0,list)
    

    Or putting it into a nice function:

    def binary_f_val(f,list):
        val0 = f(list[0])
        return binary_f(lambda x:f(x) != val0,list)