I am having difficulty writing a modified binary search algorithm that returns the smallest number greater than or equal to X which is not present in the sorted array.
For example, if the array is {1,2,3,5,6}
and x = 2
then answer is 4. Please guide me how to write the binary search for this. I have to answer this in O(log n) time for each x. Since I am taking this array as input which will initially take linear time, you may do some kind of preprocessing on the array initially if you want.
x is also taken as input and may or may not be present in the array.
The input array may have repeating elements.
My input numbers can be in the range [0,10^9] and hence first putting all the missing values in the array is not feasible because of space constraints.
Also, you can do preprocessing which takes O(n) time since you are taking the array as input in linear time. After that, there will be let us say 10^6 queries of X, which you have to answer, each in O(log n) time
If I understand correctly, you are allowed to do any kind of preprocessing and only finding the result for different x
must be O(log n)
. If thats the case finding the result after preprocessing isn't a big deal. O(log n)
search algorithms do exist. Good candidates are std::binary_search
or std::lower_bound
.
A very naive approach is to prepare a vector with all missing elements and then std::lower_bound
on that:
#include <iostream>
#include <vector>
#include <algorithm>
int main() {
std::vector<int> input{1,2,3,5,6,10,12};
std::vector<int> missing_elements{4,7,8,9,11};
int x = 2;
auto it = std::lower_bound(missing_elements.begin(),missing_elements.end(),x);
std::cout << *it << "\n";
}
Populating missing_elements
can be done in O(1)
. However, a missing_elements
with a size of the order of 10^9
is of course not feasible. Also this approach is extremely wasteful for input like [1,100000000]
(not in terms of time complexity, but in terms of runtime and memory usage).
An idea put up by Jarod42 in a comment is to prepare a vector of segments and then std::lower_bound
on that. First assuming preprocessing has already been done:
#include <iostream>
#include <vector>
#include <algorithm>
int find_first_missing(const std::vector<std::pair<int,int>>& segments,int x){
std::pair<int,int> p{x,x};
auto it = std::lower_bound(segments.begin(),segments.end(),p,[](auto a,auto b){
return a.second < b.second;
});
if (it == segments.end()) return x;
if (it->first > x) return x;
return it->second+1;
}
int main() {
std::vector<int> input{1,2,3,5,6,10,12};
std::vector<std::pair<int,int>> segments{{1,3},{5,6},{10,10},{12,12}};
for (int x=0; x<13;++x) std::cout << x << " -> " << find_first_missing(segments,x) << "\n";
}
0 -> 0
1 -> 4
2 -> 4
3 -> 4
4 -> 4
5 -> 7
6 -> 7
7 -> 7
8 -> 8
9 -> 9
10 -> 11
11 -> 11
12 -> 13
Because input
is sorted, and segments
is sorted, we can use a custom comparator that only compares the end of the segment. The vector of segments is also sorted with respect to that comparator. The call to lower_bound
returns an iterator to the segment where either x
is inside or x
is lower than the segment, hence if (it->first > x) return x;
otherwise we know that it->second+1
is the next missing number.
Now it is only left to create the vector of segments:
#include <iostream>
#include <vector>
#include <algorithm>
#include <cassert>
std::vector<std::pair<int,int>> segment(const std::vector<int>& input){
std::vector<std::pair<int,int>> result;
if (input.size() == 0) return result;
int current_start = input[0];
for (int i=1;i<input.size();++i){
if (input[i-1] == input[i] || input[i-1]+1 == input[i]) continue;
result.push_back({current_start,input[i-1]});
current_start = input[i];
}
result.push_back({current_start,input.back()});
return result;
}
int main() {
std::vector<int> input{1,2,3,5,6,10,12};
std::vector<std::pair<int,int>> expected{{1,3},{5,6},{10,10},{12,12}};
auto result = segment(input);
for (const auto& e : result){
std::cout << e.first << " " << e.second << "\n";
}
assert(expected == result);
}