Search code examples
arrayscbinary-search

Safety concerns regarding Binary Search


The binary search algorithm is pretty cool. Here's the basic, recursive implementation in C that I see everywhere on the internet:

int binsearch(int arr[], int l, int h, int key) {

    if (l < h) {

        int mid = (l + h) / 2;

        if (key == arr[mid])
            return mid;
        else
        if (key < arr[mid])
            return binsearch(arr, l, mid - 1, key);
        else
            return binsearch(arr, mid + 1, h, key);

    } else {

        return -1; /* if key isn't found, -1 is a sign of failure */

    }
}

/* calling and printing in main */

printf("%d\n", binsearch(array, 0, n - 1, my_key));

Although there may seem nothing wrong with this, when I run this, it doesn't work for marginal elements of the array (i.e. arr[0] and arr[n-1]), returning -1. Here's what worked for me:

#include <stdio.h>

int A[] = { -6, -5, -3, -1, 0, 4, 5, 9, 12, 13 }; /* don't count, n = 10 */
int k;

int binsearch(int s, int e, int n) {

    if (s < e) {

        int mid = (s + e) / 2;
        
        if (n == A[mid]) {

            return mid;

        } else
        if (n < A[mid]) {

            return binsearch(s, mid, n); /* change: mid-1 */

        } else {

            return binsearch(mid, e, n); /* change: mid+1 */

        }

    } else {

        return -1;

    }
}

int main() {

     printf("key: ");
     scanf("%d", &k);

     printf("Place in array: %d\n", binsearch(0, 10, k)); /* change: n-1 to n */
}

My question: are these changes unsafe to memory (for example, is the function accessing memory out of array, as it should be 9, not 10) or inefficient? Is there something I'm missing that wouldn't have required these changes?

Thanks in advance!


Solution

  • Another corner problem in both of OP's functions.

    Overflow

    int mid = (l+h)/2; risks overflow in l+h.

    Given -1 <= l <= h, a non-overflow solution is

    int mid = l + (h-l)/2;
    

    This also works when l, h are some unsigned type.

    Type

    Array indexes may exceed the int range. Use size_t to handle even the largest of arrays.

    Recall size_t is an unsigned type and so the algorithm needs to guard against mid-1 when mid == 0. The classic approach if not not specify the last index, but the last index + 1 and adjust calculations and tests accordingly.


    Alterative

    Some untested code incorporating the above to give OP an idea.

    /*
     * Return a pointer to the matching array element or NULL when not found. 
     * nmemb is the number of elements in the array.  May be 0 to SIZE_MAX.
     */
    int *binsearch_int(int key, const int *base, size_t nmemb) {
      while (nmemb > 0) {
        size_t mid = nmemb/2;
        if(key == base[mid]) {
          return &base[mid];
        }
    
        if(key < base[mid]) {
          nmemb = mid;
        } else {
          base += mid + 1;
          nmemb -= mid + 1;
        }
      }
      return NULL;
    }