How to get the mean of each image in a batch?


I have a batch of images with the shape [None, 256, 256, 3] (the batch dimension is left as None so that the batch size can vary).

I am trying to implement a layer that calculates the average of each image (or frame) in the batch, resulting in the shape [None, 1] or [None, 1, 1, 1]. I looked at tf.keras.layers.Average, but it averages a list of input tensors element-wise and returns a tensor of the same shape, so it does not reduce within each image.

So I tried implementing the following custom layer:

class ElementMean(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(ElementMean, self).__init__(**kwargs)
    
    def call(self, inputs):
        tensors = []
        for ii in range(inputs.shape[0] if inputs.shape[0] is not None else 1):
            tensors.append(inputs[ii, ...])
        return tf.keras.layers.Average()(tensors)

but when it is used:

import tensorflow as tf

x = tf.keras.Input([256, 256, 3], None)
y = ElementMean()(x)

model = tf.keras.Model(inputs=x, outputs=y)
model.compile()
model.summary()
tf.keras.utils.plot_model(
    model,
    show_shapes=True,
    show_dtype=True,
    show_layer_activations=True,
    show_layer_names=True
)

I get the result:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input_1 (InputLayer)        [(None, 256, 256, 3)]     0

 element_mean (ElementMean)  (256, 256, 3)             0

=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

This is wrong: the batch dimension is gone and the image dimensions have not been reduced.

I also tried this change to the call method:

    def call(self, inputs):
        tensors = []
        for ii in range(inputs.shape[0] if inputs.shape[0] is not None else 1):
            tensors.append(tf.reduce_mean(inputs[ii, ...]))
        return tf.convert_to_tensor(tensors)

which in turn results in:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 input_1 (InputLayer)        [(None, 256, 256, 3)]     0

 element_mean (ElementMean)  (1,)                      0

=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

This is also wrong: the output has a fixed size of 1 instead of one value per image.


Solution

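  • Instead of looping over the batch, reduce over the non-batch axes (height, width, channels) with tf.reduce_mean and set keepdims=True to keep them as size-1 dimensions: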

    import tensorflow as tf
    
    class ElementMean(tf.keras.layers.Layer):
        def __init__(self, **kwargs):
            super(ElementMean, self).__init__(**kwargs)
        
        def call(self, inputs):
            # Average over axes 1, 2, 3 (height, width, channels); axis 0 (batch) is untouched.
            return tf.reduce_mean(inputs, axis=(1, 2, 3), keepdims=True)
    
    x = tf.keras.layers.Input([256, 256, 3], None)
    em = ElementMean()
    y = em(x)
    model = tf.keras.Model(x, y)
    model.summary()
    
    Model: "model_1"
    _________________________________________________________________
     Layer (type)                Output Shape              Param #   
    =================================================================
     input_1 (InputLayer)        [(None, 256, 256, 3)]     0         
                                                                     
     element_mean_1 (ElementMean  (None, 1, 1, 1)          0         
     )                                                               
                                                                     
    =================================================================
    Total params: 0
    Trainable params: 0
    Non-trainable params: 0
    _________________________________________________________________
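
To check the reduction on concrete data, and to get the [None, 1] shape from the question instead, something like the following sketch may help. The batch here is made up for illustration (four 8x8x3 images, where image i is filled with the value i), and the names batch, mean_4d and mean_2d are not part of the original code.

```python
import numpy as np
import tensorflow as tf

# Made-up batch (an assumption for this sketch): 4 images of 8x8x3,
# where image i is filled with the constant value i.
batch = np.stack([np.full((8, 8, 3), i, dtype=np.float32) for i in range(4)])

# Reduce over height, width and channels; keepdims=True keeps them as size-1 axes.
mean_4d = tf.reduce_mean(batch, axis=(1, 2, 3), keepdims=True)

# For a (batch, 1) result, drop the reduced axes and add one trailing axis back.
mean_2d = tf.expand_dims(tf.reduce_mean(batch, axis=(1, 2, 3)), axis=-1)

print(mean_4d.shape)            # (4, 1, 1, 1)
print(mean_2d.shape)            # (4, 1)
print(mean_2d.numpy().ravel())  # [0. 1. 2. 3.]
```

Because the batch axis is never indexed or looped over, the same call should work when the batch dimension is None inside a Keras model.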