I am participating in this ASL fingerspelling Kaggle Competition.
We are given a collection of phrases like "9560 plano". Each phrase has a table associated with it. The rows of the table are frame numbers of a video. The columns are the x, y, and z coordinates of 1630 points on a human body. The goal of the competition is to create a model which will recover the phrase from the table.
I have a model which works okay, but it requires some preprocessing of the data.
In particular, I
I would like to make a Layer subclass which does this preprocessing.
I don't expect you to do all of this. I instead request a minimal working example which does something similar.
Please give code for a keras Layer subclass which will:
take an input table like this:

col 1 | col 2 | col 3 | col 4 |
---|---|---|---|
a1 | b1 | c1 | d1 |
a2 | b2 | c2 | d2 |

and, for rows where c >= b, return:

col 1 | col 2 | col 3 |
---|---|---|
a1 | c1 - b1 | d1 - b1 |
a2 | c2 - b2 | d2 - b2 |

and, for rows where c < b, return:

col 1 | col 2 | col 3 |
---|---|---|
a1 | b1 - c1 | b1 - d1 |
a2 | b2 - c2 | b2 - d2 |

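
To pin down the rule the tables describe, here is a plain NumPy sketch of the same row-wise transform. The function name `signed_differences` is hypothetical, and treating ties (c == b) like the c >= b case is an assumption rather than something the tables settle.

```python
import numpy as np


def signed_differences(table):
    """Hypothetical reference: map rows [a, b, c, d] to [a, +/-(c - b), +/-(d - b)]."""
    table = np.asarray(table, dtype=np.float32)
    a, b, c, d = table[:, 0], table[:, 1], table[:, 2], table[:, 3]
    # Assumption: rows with c == b are handled like rows with c > b.
    sign = np.where(c >= b, 1.0, -1.0).astype(np.float32)
    # The sign is chosen per row, so col 2 is always |c - b|.
    return np.stack([a, sign * (c - b), sign * (d - b)], axis=1)


example = signed_differences([[0, 14, 10, 17], [8, 9, 16, 13]])
print(example)
```

A reference like this is mainly useful for checking a Keras layer's output on small inputs.
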
You can use `tf.matmul` to get both versions of the new data (for cases where c < b and where c >= b). Then you can calculate the mask using a tensor comparison. Lastly, use `tf.where` to choose which version to use based on the mask value:
```python
import numpy as np
import tensorflow as tf
import keras


class MyLayer(keras.layers.Layer):
    def __init__(self):
        super().__init__()
        # Right-multiplying a row [a, b, c, d] by tensor1 gives [a, c - b, d - b].
        self.tensor1 = tf.constant([
            [1,  0,  0],
            [0, -1, -1],
            [0,  1,  0],
            [0,  0,  1],
        ], dtype='float32')
        # Right-multiplying a row [a, b, c, d] by tensor2 gives [a, b - c, b - d].
        self.tensor2 = tf.constant([
            [1,  0,  0],
            [0,  1,  1],
            [0, -1,  0],
            [0,  0, -1],
        ], dtype='float32')

    def call(self, inputs):
        assert inputs.shape[-1] == 4
        # True for rows where c >= b. The (batch, 1) shape broadcasts over the
        # three output columns and does not need a static batch size.
        mask = inputs[:, 2:3] >= inputs[:, 1:2]
        return tf.where(mask, tf.matmul(inputs, self.tensor1), tf.matmul(inputs, self.tensor2))


inputs = tf.constant(np.random.randint(0, 20, size=8), dtype='float32', shape=(2, 4))
my_layer = MyLayer()
print('Input:\n', inputs)
print('\nOutput:\n', my_layer(inputs))
```
Sample output (the input is random, so the values will vary):

```
Input:
 tf.Tensor(
[[ 0. 14. 10. 17.]
 [ 8.  9. 16. 13.]], shape=(2, 4), dtype=float32)

Output:
 tf.Tensor(
[[ 0.  4. -3.]
 [ 8.  7.  4.]], shape=(2, 3), dtype=float32)
```
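
If the two constant matrices feel opaque, the same rule can probably be written with slices instead of `tf.matmul`. The sketch below is a variant under assumptions, not part of the approach above: the class name `SignedDiffLayer` is hypothetical, and ties (c == b) are treated like the c >= b case.

```python
import tensorflow as tf


class SignedDiffLayer(tf.keras.layers.Layer):
    """Hypothetical variant: same row-wise rule, built from slices instead of matmul."""

    def call(self, inputs):
        a = inputs[:, 0:1]       # column a, shape (batch, 1)
        b = inputs[:, 1:2]       # column b, shape (batch, 1)
        rest = inputs[:, 2:4]    # columns c and d, shape (batch, 2)
        diff = rest - b          # [c - b, d - b]; b broadcasts over both columns
        # Assumption: rows with c == b keep the unflipped differences.
        c_ge_b = rest[:, 0:1] >= b
        signed = tf.where(c_ge_b, diff, -diff)
        return tf.concat([a, signed], axis=-1)


x = tf.constant([[0., 14., 10., 17.], [8., 9., 16., 13.]])
result = SignedDiffLayer()(x).numpy()
print(result)
```

On the sample input shown above, this yields the same rows as the matmul version. The slicing form extends more directly to wider tables, since only the slice bounds change.
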