Search code examples
javaarraysalgorithmbinary-searchlogarithm

How is maximum key compare in binary search (1+log(n)) not log(n)?


I was re going through a more versatile explanation of BS. Which has a preposition that maximum number of key compare in a binary search is (1+log(n)). I tried to form an intuition as to how is that possible. I understand the ruining time of BS is (log(n)). Also i speculate that the worse time/maximum key lookups will happen in scenario when we search for a element which is not present in the array. But each time i go over a hypothetical array doing a BS i end up with (log(n)) compares/steps never more than that. Here is the code which was used to form that preposition.

public static int binarySearch(int[] a, int key){
     int lo = 0, hi = a.length-1;
     while (lo <= hi){
         int mid = lo + (hi - lo) / 2;

         if (key < a[mid]) 
            hi = mid - 1;
         else if (key > a[mid]) 
            lo = mid + 1;
         else 
            return mid;
     }
     return -1;
 }

Probably my speculation is wrong or i am missing some point. If you could explain how maximum possible key compares would be (1+log(n)) it would be great thanks.


Solution

  • Here's a different way to think about what you're doing. Rather than thinking of binary search as searching over the elements of the array, think of binary search as searching over the separators between the elements in the array. Specifically, imagine numbering the array like this:

       +-----+-----+-----+-----+-----+
       |  0  |  1  |  2  | ... | n-1 |
       +-----+-----+-----+-----+-----+
    

    Now, number the separators:

       +-----+-----+-----+-----+-----+
       |  0  |  1  |  2  | ... | n-1 |
       +-----+-----+-----+-----+-----+
       0     1     2     3 .. n-1    n
    

    Notice that there are n+1 total separators, one before each element and one after the very last element.

    Whenever you do a binary search, you're probing the index of the middle separator (do you see why?) and throwing half of the separators away. You can only throw half of a collection of k items away log2 k times before you're down to a single remaining element. This means that the number of probes needed will be ⌈log2 (n+1)⌉, and it happens to be the case that

    log2 n < ⌈log2 (n+1)⌉ ≤ log2 n + 1,

    so the "1 + log n" bit ends up arising more from "throw away half the separators" than from other sources.

    Hope this helps!