Max pooling is a technique I read about here: https://computersciencewiki.org/index.php/Max-pooling_/_Pooling. I understand that it is used to approximate (downsample) the input, which reduces the time a neural network spends working on it. What I can't pinpoint is why it should select the max values. Is that effective, and if so, why? Other options could be selecting the mean, the min, or maybe the top-left value, for instance.
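We take the max of each window to keep the pixel that is most strongly activated. A stronger activation means the feature that the filter detects is more strongly present at that position, so that value carries the most information about the window.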
There are variations such as average pooling, which takes the mean of all pixels in a window, but in practice the results do not differ much.
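To make the two options concrete, here is a minimal NumPy sketch of non-overlapping 2x2 pooling. The helper name `pool2d` and the toy `feature_map` are made up for illustration, and the sketch assumes a single 2D feature map whose sides are divisible by the window size.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling over a 2D array (hypothetical helper).

    Assumes both sides of `x` are divisible by `size`.
    """
    h, w = x.shape
    # Reshape so each window gets its own pair of axes:
    # (window row, row inside window, window column, column inside window).
    windows = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    if mode == "avg":
        return windows.mean(axis=(1, 3))
    raise ValueError(f"unknown mode: {mode}")

# Toy feature map: a few strong activations on a mostly silent background.
feature_map = np.array([
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.2],
    [0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

max_pooled = pool2d(feature_map, mode="max")  # keeps the strongest value per window
avg_pooled = pool2d(feature_map, mode="avg")  # blends each window into its mean
print(max_pooled)
print(avg_pooled)
```

On this input the 0.9 activation survives max pooling unchanged, while average pooling dilutes it with the surrounding zeros. Whether that matters for the final accuracy depends on the network and the data.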
Max pooling is effective and fast. Another reason to prefer it over average pooling is that the gradient is cheap to compute during backprop: it is passed only to the pixel that held the maximum in each window, and every other pixel gets zero gradient.
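The gradient point can be sketched with NumPy as well. The function names below are hypothetical, the code assumes non-overlapping windows on a single 2D input, and real frameworks implement this differently (for example by storing the argmax indices during the forward pass).

```python
import numpy as np

def max_pool_backward(x, grad_out, size=2):
    """Route each upstream gradient to the max position of its window."""
    h, w = x.shape
    windows = x.reshape(h // size, size, w // size, size)
    maxes = windows.max(axis=(1, 3), keepdims=True)
    # Boolean mask of the winning positions. Simplification: if a window
    # has ties, every tied position receives the full gradient.
    mask = windows == maxes
    grad = mask * grad_out[:, None, :, None]
    return grad.reshape(h, w)

def avg_pool_backward(x, grad_out, size=2):
    """Spread each upstream gradient equally over its window."""
    expanded = np.repeat(np.repeat(grad_out, size, axis=0), size, axis=1)
    return expanded / (size * size)

x = np.arange(16, dtype=float).reshape(4, 4)  # distinct values, so no ties
grad_out = np.ones((2, 2))                    # pretend upstream gradient

max_grad = max_pool_backward(x, grad_out)
avg_grad = avg_pool_backward(x, grad_out)
print(max_grad)
print(avg_grad)
```

With max pooling only one input per window gets a non-zero gradient, whereas average pooling gives every input a share of 1/4.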