
What is the difference between global_max_pool/global_avg_pool and avg_pool_2d/1d/3d?


I tried to compare the text classification tutorial code from tflearn: https://github.com/tflearn/tflearn/blob/master/examples/nlp/cnn_sentence_classification.py

with the one from dennybritz: https://github.com/dennybritz/cnn-text-classification-tf

The two give different results. I understand that this may be because the tflearn tutorial uses 1D convolution, but there is one line of code that I don't understand:

network = global_max_pool(network)

What is the difference between global_max_pool and max_pool_2d?


Solution

  • Looking at the code, the two functions make different calls to the TensorFlow library:

    max_pool_2d

    Does broadly what you would expect. As well as doing some other things, it returns:

    tf.nn.max_pool(incoming, kernel, strides, padding)
    

    with the specified arguments. The result is still a 4-D tensor, like the input; only the height and width shrink, depending on the kernel, strides, and padding.

    global_max_pool

    Actually performs a pretty drastic reduction in the input tensor. The input tensor is of dimension:

    [batch, height, width, in_channels]
    

    The function global_max_pool then returns (as well as doing some other things):

    tf.reduce_max(incoming, [1, 2])
    

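    This reduces over axes 1 and 2 (height and width), so I think it gives the single maximum value of each channel for each example in the batch. The result is a 2-D tensor of shape [batch, in_channels].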
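
To make the shape difference concrete, here is a minimal sketch that uses NumPy as a stand-in for the two TensorFlow calls above. It is not tflearn's actual code: the helper names `window_max_pool_2x2` and `global_max_pool_np` are made up for illustration, and the windowed version assumes a 2x2 kernel with stride 2 and input sizes divisible by 2.

```python
import numpy as np

# Toy input in the layout described above: [batch, height, width, in_channels]
x = np.arange(2 * 4 * 4 * 3, dtype=np.float32).reshape(2, 4, 4, 3)


def window_max_pool_2x2(t):
    """Hypothetical stand-in for max_pool_2d with a 2x2 kernel and stride 2."""
    b, h, w, c = t.shape
    # Split height and width into non-overlapping 2x2 blocks,
    # then take the maximum inside each block.
    blocks = t.reshape(b, h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(2, 4))


def global_max_pool_np(t):
    """Hypothetical stand-in for global_max_pool: reduce over height and width."""
    return t.max(axis=(1, 2))


windowed = window_max_pool_2x2(x)
pooled = global_max_pool_np(x)

print(windowed.shape)  # (2, 2, 2, 3): still 4-D, height and width halved
print(pooled.shape)    # (2, 3): one value per channel per example
```

In other words, global max pooling behaves like a max pool whose window covers the whole height and width, with the two spatial axes then dropped from the result.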