c++algorithm image-processing grid breadth-first-search

Grid nearest neighbour BFS slow

Im trying to upsample my image. I fill the upsampled version with corresponding pixels in this way.
pseudocode: upsampled.getPixel(((int)(x * factorX), (int)(y * factorY))) = old.getPixel(x, y)
as a result i end up with the bitmap that is not completely filled and I try to fill each not filled pixel with it's nearest filled neighbor.
I use this method for nn search and call it for each unfilled pixel. I do not flag unfilled pixel as filled after changing its value as it may create some weird patterns. The problem is that - it works but very slow. Execution time on my i7 9700k for 2500 x 3000 img scaled by factor x = 1,5 and y = 1,5 takes about 10 seconds.

template<typename T>
std::pair<int, int> cn::Utils::nearestNeighbour(const Bitmap <T> &bitmap, const std::pair<int, int> &point, int channel, const bool *filledArr) {
auto belongs = [](const cn::Bitmap<T> &bitmap, const std::pair<int, int> &point){
    return point.first >= 0 && point.first < bitmap.w && point.second >= 0 && point.second < bitmap.h;
};
if(!(belongs(bitmap, point))){
    throw std::out_of_range("This point does not belong to bitmap!");
}
auto hash = [](std::pair<int, int> const &pair){
    std::size_t h1 = std::hash<int>()(pair.first);
    std::size_t h2 = std::hash<int>()(pair.second);
    return h1 ^ h2;
};

//from where, point
std::queue<std::pair<int, int>> queue;
queue.push(point);

std::unordered_set<std::pair<int, int>, decltype(hash)> visited(10, hash);

while (!queue.empty()){
    auto p = queue.front();
    queue.pop();
    visited.insert(p);
    if(belongs(bitmap, p)){
        if(filledArr[bitmap.getDataIndex(p.first, p.second, channel)]){
            return {p.first, p.second};
        }
        std::vector<std::pair<int,int>> neighbors(4);
        neighbors[0] = {p.first - 1, p.second};
        neighbors[1] = {p.first + 1, p.second};
        neighbors[2] = {p.first, p.second - 1};
        neighbors[3] = {p.first, p.second + 1};

        for(auto n : neighbors) {
            if (visited.find(n) == visited.end()) {
                queue.push(n);
            }
        }
    }
}
return std::pair<int, int>({-1, -1});
}

the bitmap.getDataIndex() works in O(1) time. Here's its implementation

template<typename T>
int cn::Bitmap<T>::getDataIndex(int col, int row, int depth) const{
    if(col >= this->w or col < 0 or row >= this->h or row < 0 or depth >= this->d or depth < 0){
        throw std::invalid_argument("cell does not belong to bitmap!");
    }
    return depth * w * h + row * w + col;
}

I have spent a while on debugging this but could not really find what makes it so slow.
Theoretically when scaling by factor x = 1,5, y = 1,5, the filled pixel should be no further than 2 pixels from unfilled one, so well implemented BFS wouldn't take long.

Also i use such encoding for bitmap, example for 3x3x3 image

     * (each row and channel is in ascending order)

     * {00, 01, 02}, | {09, 10, 11}, | {18, 19, 20},
    c0 {03, 04, 05}, c1{12, 13, 14}, c2{21, 22, 23},
     * {06, 07, 08}, | {15, 16, 17}, | {24, 25, 26},

Solution

the filled pixel should be no further than 2 pixels from unfilled one, so well implemented BFS wouldn't take long.

Sure, doing it once won’t take long. But you need to do this for almost every pixel in the output image, and doing lots of times something that doesn’t take long will still take long.

Instead of searching for a set pixel, use the information you have about the earlier computation to directly find the values you are looking for.

For example, in your output image, and set pixel, is at ((int)(x * factorX), (int)(y * factorY)), for integer x and y. So for a non-set pixel (a, b), you can find the nearest set pixel by ((int)(round(a/factorX)*factorX), (int)(round(b/factorY)*factorY)).

However, you are much better off directly upsampling the image in a simpler way: don’t loop over the input pixels, instead loop over the output pixels, and find the corresponding input pixel.