tensorflow keras

Preprocessing and feature selection in a custom keras layer


I am participating in the ASL fingerspelling Kaggle competition.

We are given a collection of phrases like "9560 plano". Each phrase has a table associated with it. The rows of the table are frame numbers of a video. The columns are the x, y, and z coordinates of 1630 points on a human body. The goal of the competition is to create a model which will recover the phrase from the table.

I have a model which works okay, but it requires some preprocessing of the data.

In particular, I

  • Identify which hand is being used to spell the phrase.
  • If it is the left hand, I reflect the hand in the yz plane and overwrite the right-hand coordinates with the reflected coordinates. Essentially, this makes everyone "right-handed" in my model.
  • I then subtract the vector pointing to the wrist from the vector pointing to each point in the right hand. This "stabilizes" the hand.
  • In the end I only have 60 features I care about (right_hand_x_1, right_hand_x_2, ..., right_hand_z_20).
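
For reference, the steps above can be sketched in NumPy roughly as follows. This is only an illustration under stated assumptions, not the actual competition code: each hand is assumed to arrive as an array of shape (frames, 21, 3) with index 0 being the wrist, frames where a hand was not detected are assumed to be NaN, and "the hand detected in more frames" is a hypothetical rule for deciding which hand is signing. The function name `preprocess` is made up for this sketch. Negating x is enough for the reflection here because only wrist-relative offsets are kept, so any constant shift cancels.

```python
import numpy as np

def preprocess(left, right):
    """Return wrist-relative, right-handed features of shape (frames, 60).

    left, right: float arrays of shape (frames, 21, 3).
    Assumptions (not from the original post): index 0 is the wrist and
    undetected frames are filled with NaN.
    """
    # 1. Hypothetical heuristic: the signing hand is the one seen in more frames.
    left_seen = np.sum(~np.isnan(left[:, 0, 0]))
    right_seen = np.sum(~np.isnan(right[:, 0, 0]))

    if left_seen > right_seen:
        # 2. Reflect in the yz plane (negate x) so the hand looks right-handed.
        hand = left * np.array([-1.0, 1.0, 1.0])
    else:
        hand = right

    # 3. Subtract the wrist position from every point of the hand.
    hand = hand - hand[:, :1, :]

    # 4. Drop the wrist itself (now all zeros): 20 points x 3 coordinates.
    rel = hand[:, 1:, :]                                   # (frames, 20, 3)
    # Order the features as x_1..x_20, y_1..y_20, z_1..z_20.
    return rel.transpose(0, 2, 1).reshape(len(rel), 60)

# Usage with dummy data: a left-handed signer, right hand never detected.
left = np.arange(2 * 21 * 3, dtype=float).reshape(2, 21, 3)
right = np.full((2, 21, 3), np.nan)
features = preprocess(left, right)
print(features.shape)
```

The branch on the detected hand is plain Python here; inside a Keras layer the same selection would have to be expressed with tensor operations.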

I would like to make a Layer subclass which does this preprocessing.

I don't expect you to do all of this. I instead request a minimal working example which does something similar.

Please give code for a keras Layer subclass which will:

  • take a 2x4 2D input tensor

        col1  col2  col3  col4
        a1    b1    c1    d1
        a2    b2    c2    d2

  • if c1 > b1, return the 2x3 tensor

        col1  col2     col3
        a1    c1 - b1  d1 - b1
        a2    c2 - b2  d2 - b2

  • if c1 < b1, return the 2x3 tensor

        col1  col2     col3
        a1    b1 - c1  b1 - d1
        a2    b2 - c2  b2 - d2

Solution

  • You can use tf.matmul to compute both versions of the output (one for rows where c > b, one for rows where c <= b) by multiplying the input by a constant 4x3 matrix. Then build a boolean mask by comparing the two columns row by row, and use tf.where to pick the version each row needs:

    import tensorflow as tf
    import keras as K
    import numpy as np
    
    class MyLayer(K.layers.Layer):
        def __init__(self):
            super().__init__()
            # A row [a, b, c, d] multiplied by tensor1 becomes [a, c - b, d - b].
            self.tensor1 = tf.constant([
                [1, 0,  0 ],
                [0, -1, -1],
                [0, 1,  0 ],
                [0, 0,  1 ]
            ], dtype='float32')
            # A row [a, b, c, d] multiplied by tensor2 becomes [a, b - c, b - d].
            self.tensor2 = tf.constant([
                [1, 0,  0 ],
                [0, 1,  1 ],
                [0, -1, 0 ],
                [0, 0,  -1]
            ], dtype='float32')
    
        def call(self, inputs):
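            assert inputs.shape[-1] == 4
            # Per-row condition c > b, kept as a column of shape (rows, 1) so that
            # tf.where broadcasts it across the three output columns. Unlike a
            # reshape to inputs.shape[0], this also works when the batch size is None.
            mask = inputs[:, 2:3] > inputs[:, 1:2]
            return tf.where(mask, tf.matmul(inputs, self.tensor1), tf.matmul(inputs, self.tensor2))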
    
    
    inputs = tf.constant(np.random.randint(0, 20, size=8), dtype='float32', shape=(2, 4))
    my_layer = MyLayer()
    print('Input:\n', inputs)
    print('\nOutput:\n', my_layer(inputs))
    

    Example output (the input is random, so the values differ between runs):

    Input:
     tf.Tensor(
    [[ 0. 14. 10. 17.]
     [ 8.  9. 16. 13.]], shape=(2, 4), dtype=float32)
    
    Output:
     tf.Tensor(
    [[ 0.  4. -3.]
     [ 8.  7.  4.]], shape=(2, 3), dtype=float32)
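
To see why the two constant matrices work, the sample above can be re-checked in plain NumPy. The sketch below copies the matrices from the layer and compares the matmul-plus-mask approach against a deliberately naive row-by-row loop. The names `M_GT`, `M_LE`, `reference` and `matmul_version` are made up for this check and are not part of any library.

```python
import numpy as np

# Same constant matrices as tensor1 and tensor2 in the layer above.
M_GT = np.array([[1, 0, 0], [0, -1, -1], [0, 1, 0], [0, 0, 1]], dtype=np.float32)
M_LE = np.array([[1, 0, 0], [0, 1, 1], [0, -1, 0], [0, 0, -1]], dtype=np.float32)

def reference(x):
    """Row-by-row version of the rule, written without any matrix tricks."""
    rows = []
    for a, b, c, d in x:
        rows.append([a, c - b, d - b] if c > b else [a, b - c, b - d])
    return np.array(rows, dtype=np.float32)

def matmul_version(x):
    """Compute both candidate outputs, then pick one per row with a mask."""
    mask = x[:, 2:3] > x[:, 1:2]   # shape (rows, 1), broadcasts over 3 columns
    return np.where(mask, x @ M_GT, x @ M_LE)

# The input shown in the example output above.
x = np.array([[0, 14, 10, 17], [8, 9, 16, 13]], dtype=np.float32)
print(matmul_version(x))
# [[ 0.  4. -3.]
#  [ 8.  7.  4.]]
```

The first row has c < b, so it takes the [a, b - c, b - d] branch. The second row has c > b, so it takes [a, c - b, d - b]. Both match the layer's printed output.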