I always have the hardest time with this and I have yet to see a definitive explanation for something that is supposedly so common and highly-used.
We already know the standard binary search. Given starting lower and upper bounds, find the middle point at (lower + higher)/2, and then compare it against your array, and then re-set the bounds accordingly, etc.
However what are the needed differences to adjust the search to find (for a list in ascending order):
It seems like each of these cases requires very small tweaks to the algorithm but I can never get them to work right. I try changing inequalities, return conditions, I change how the bounds are updated, but nothing seems consistent.
What are the definitive ways to handle these four cases?
Binary search(at least the way I implement it) relies on a simple property - a predicate holds true for one end of the interval and does not hold true for the other end. I always consider my interval to be closed at one end and opened at the other. So let's take a look at this code snippet:
int beg = 0; // pred(beg) should hold true
int end = n;// length of an array or a value that is guranteed to be out of the interval that we are interested in
while (end - beg > 1) {
int mid = (end + beg) / 2;
if (pred(a[mid])) {
beg = mid;
} else {
end = mid;
}
}
// answer is at a[beg]
This will work for any of the comparisons you define. Simply replace pred
with <=target
or >=target
or <target
or >target
.
After the cycle exits, a[beg]
will be the last element for which the given inequality holds.
So let's assume(like suggested in the comments) that we want to find the largest number for which a[i] <= target
. Then if we use predicate a[i] <= target
the code will look like:
int beg = 0; // pred(beg) should hold true
int end = n;// length of an array or a value that is guranteed to be out of the interval that we are interested in
while (end - beg > 1) {
int mid = (end + beg) / 2;
if (a[mid] <= target) {
beg = mid;
} else {
end = mid;
}
}
And after the cycle exits, the index that you are searching for will be beg
.
Also depending on the comparison you may have to start from the right end of the array. E.g. if you are searching for the largest value >= target, you will do something of the sort of:
beg = -1;
end = n - 1;
while (end - beg > 1) {
int mid = (end + beg) / 2;
if (a[mid] >= target) {
end = mid;
} else {
beg = mid;
}
}
And the value that you are searching for will be with index end
. Note that in this case I consider the interval (beg, end]
and thus I've slightly modified the starting interval.