Search code examples
javaarraysdata-structuresbinary-search

Equality and in-equality operators used in binary search implemented in Java


I have been trying to figure out why in the binary search code of Java, we have used '<=' rather than simply using a '=='. Is it some sort of optimization?

The following piece of code is from Class: java.util.Arrays, method: binarySearch0()

Code:

    private static int binarySearch0(long[] a, int fromIndex, int toIndex, long key) {
        int low = fromIndex;
        int high = toIndex - 1;

        while(low <= high) {
            int mid = low + high >>> 1;
            long midVal = a[mid];
            if (midVal < key) {
                low = mid + 1;
            } else {
                if (midVal <= key) { // Why here have we used '<=' rather than '=='
                    return mid;
                }

                high = mid - 1;
            }
        }

        return -(low + 1);
    }

Solution

  • Your code differs from the one used within the JDK (http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/be44bff34df4/src/share/classes/java/util/Arrays.java#l1825), so I can only assume that you used some kind of decompiler to arrive at your source code.

    The original source code is readable and clearly communicates the intention:

    private static int binarySearch0(long[] a, int fromIndex, int toIndex,
                                     long key) {
        int low = fromIndex;
        int high = toIndex - 1;
    
        while (low <= high) {
            int mid = (low + high) >>> 1;
            long midVal = a[mid];
    
            if (midVal < key)
                low = mid + 1;
            else if (midVal > key)
                high = mid - 1;
            else
                return mid; // key found
        }
        return -(low + 1);  // key not found.
    }
    

    Now if you look at the if statements:

            if (midVal < key)
                low = mid + 1;
            else if (midVal > key)
                high = mid - 1;
            else
                return mid; // key found
    

    you could rewrite it as (still the same code as before):

            if (midVal < key) {
                low = mid + 1;
            } else  {
                if (midVal > key)
                    high = mid - 1;
                else
                   return mid; // key found
            }
    

    Now you can change the comparison in the second if and swap the then and else branches of that statement:

            if (midVal < key) {
                low = mid + 1;
            } else  {
                if (midVal <= key) {
                    return mid; // key found
                }
                high = mid - 1;
            }
    

    This code is functionally equivalent but the intention is no longer visible.