Search code examples
c#arraysbinary-search

How to set mid in binary search?


I was working on this problem 278 First Bad Version on LeetCode. I have used binary search to get the element.

My way of getting middle index of an array m = (start + end)/2 was causing issue in case of large array, closer to MAX_LIMIT of int. First, I thought of int range overflow issue but I am not sure because it worked with end = MAX_LIMIT and some start < MAX_LIMIT even though its going over int range.

I would like to understand how m = start + (end - start)/2 is better than m = (start + end)/2

Code 1 works with input :

2147483647

98765432

But Code 1 fails with input:

2147483647

987654321

I think overflow issue should either happen in both cases or none of them.

1. Code which Worked with m = (start + end)/2 but fails for large array

 public int FirstBadVersion(int n) {
        if(n == 1){
            return 1;
        }
        
        int s = 1;
        int e = n;
        int x = 0;
        while(s != e){
            x = (s+e)/2;
            if(IsBadVersion(x)){
                e = x;
            }
            else{
                s = x + 1;
            }
        }
        
        return s;
    }

2. Code which worked with m = start + (end - start)/2

public int FirstBadVersion(int n) {
        if(n == 1){
            return 1;
        }
        
        int s = 1;
        int e = n;
        int x= 0;
        while(s != e){
            // x = (s+e)/2;
            x = s + (e-s)/2;
            if(IsBadVersion(x)){
                e = x;
            }
            else{
                s =  x + 1;
            }
        }
        
        return e;
    }

Solution

  • You have integer overflow when computing

    (a + b) / 2
    

    when (a + b) > int.MaxValue the sum of (a + b) can't be represented as 32 bit integer (it requires 33 bits), and you have incorrect (often negative) result (when bit 33 is ignored) which you then divide by 2.

    I suggest working with long in order to be safe from integer overflow:

    public int FirstBadVersion(int n) {
      // Special case: all versions starting from #1 failed
      if (IsBadVersion(1))
        return 1;
    
      // let use long in order to avoid integer overflow
      long correct = 1;
      long failed = n;
    
      // while we have unknown version between correct and failed 
      while (failed - correct > 1) {
        int middle = (low + high) / 2); // it safe now : adding two long values
    
        if (IsBadVersion(middle)) 
            failed = middle;
        else
            correct = middle;
      }
    
      return (int)failed;   
    }
    

    If using long is cheating and you want to stick to int, you can use the formula below, for int a, b we can put (note that we don't add big values a and b but their halves)

    (a + b) / 2 == a / 2 + b / 2 + (a % 2 + b % 2) / 2
    

    Please, note that your formula is not universal

    (a + b) = a + (b - a) / 2;
    

    it will work in the particular case of problem #278 where a and b are positive, but will fail in general casw, say when a = 1_000_000_000, b = -2_000_000_000.