Tags: python, tensorflow, keras, neural-network, deep-learning

How do I build a permutation-invariant neural network in Keras?


My question is about the structure of the network required to solve my problem with less data.

I have a sensor device that simply reports the color of whatever it's seeing in front of it. Each sensor reports 4 numbers: Red, Green, Blue, and Alpha. The intensity of the color changes depending on the distance and on the object being seen. I have 6 such sensors, one attached to each side of a small cube. The cube can be moved and rotated by hand.

I want to predict the position of the cube in space in real time.

My problem:

Input: 6 identical sensors, each giving 4 numbers. Total=6*4=24 numbers.

Output: 3 numbers: X, Y, Z (the position of the cube)

I have my data ready, labeled with XYZ positions.

Currently, I train a simple multi-layer perceptron that takes the 24 numbers and outputs the 3 numbers, as sketched below. This works quite well, but it needs an enormous amount of data, covering a cubic meter of space, to predict accurately.
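For reference, this baseline is just a plain MLP; a minimal sketch (the hidden layer sizes here are my own illustrative assumption) looks like this:

from keras.layers import Input, Dense
from keras.models import Model

# Baseline: a plain MLP mapping all 24 sensor readings to XYZ.
# Hidden sizes (128) are illustrative assumptions, not fixed by the setup.
inp = Input(shape=(24,))                  # 6 sensors * 4 channels (RGBA)
h = Dense(128, activation='relu')(inp)
h = Dense(128, activation='relu')(h)
out = Dense(3, activation='linear')(h)    # X, Y, Z position
baseline = Model(inputs=inp, outputs=out)
baseline.compile(optimizer='adam', loss='mse')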

The problem is the rotation: for the model to predict well, I need to rotate the cube through a full 360 degrees at every location.

But I know for a fact that the sensors are identical, so I want to share weights among them. I also know that rotating the cube by 90 degrees should not affect the output position at all, which means the order of the sensors should not matter. This suggests I should merge the sensors' encodings with a symmetric operation such as add or average; if I use concatenate, the order is preserved, so the output position changes under rotation.
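To make that intuition concrete, here is a tiny sketch (plain NumPy, with a made-up linear map standing in for the shared encoder) showing that an averaged merge is unchanged when the sensor order is permuted, while a concatenated merge is not:

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))               # toy shared "encoder": one linear map

sensors = rng.normal(size=(6, 4))         # 6 sensors, 4 channels each
permuted = sensors[[3, 0, 5, 1, 4, 2]]    # same readings, different order

enc = sensors @ W                         # shared weights applied to every sensor
enc_p = permuted @ W

print(np.allclose(enc.mean(axis=0), enc_p.mean(axis=0)))  # True: average is order-invariant
print(np.allclose(enc.reshape(-1), enc_p.reshape(-1)))    # False: concatenation is not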

The way I'm doing it is to feed each sensor's 4 numbers into a sensor model whose weights are shared among all sensors, get the encodings, average them, and then connect the result to Dense layers. Here is the model prototype:

from keras.layers import Input, Dense, average
from keras.models import Model, Sequential

# One input per sensor: 4 numbers (R, G, B, A) each.
sensor1 = Input(shape=(4,))
sensor2 = Input(shape=(4,))
sensor3 = Input(shape=(4,))
sensor4 = Input(shape=(4,))
sensor5 = Input(shape=(4,))
sensor6 = Input(shape=(4,))

# Shared encoder: the same weights are applied to every sensor.
sensor_model = Sequential([
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
])
sensor1_encoding = sensor_model(sensor1)
sensor2_encoding = sensor_model(sensor2)
sensor3_encoding = sensor_model(sensor3)
sensor4_encoding = sensor_model(sensor4)
sensor5_encoding = sensor_model(sensor5)
sensor6_encoding = sensor_model(sensor6)

# Symmetric merge: averaging makes the result independent of sensor order.
sensor_encoding = average([
    sensor1_encoding,
    sensor2_encoding,
    sensor3_encoding,
    sensor4_encoding,
    sensor5_encoding,
    sensor6_encoding,
])
h = Dense(128, activation='relu')(sensor_encoding)
h = Dense(128, activation='relu')(h)
h = Dense(3, activation='linear')(h)  # X, Y, Z
model = Model(inputs=[sensor1, sensor2, sensor3, sensor4, sensor5, sensor6], outputs=[h])

Right now, when I change the average function to concatenate, the model's loss is lower on both the training and validation sets, which contradicts my intuition. What is wrong with my reasoning? What do you think? How should I adjust this model so that it predicts the same position after a 90-degree rotation, does not suffer at 45-degree rotations, and still keeps the useful relationships between the inputs?


Solution

  • I have found the solution and written the code for it. For anyone interested, please check: https://github.com/offchan42/keras_helpers/blob/master/permutational_layer.py (a simplified sketch of the idea follows below).
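For context, the linked code implements a permutational layer, in the spirit of Guttenberg et al. (2016): a shared pairwise network is applied to every ordered pair of sensor encodings and the results are pooled per sensor, which keeps pairwise relationships between inputs while still allowing a symmetric merge at the end. The following is my own simplified sketch of that idea (not the linked implementation; the layer sizes are illustrative assumptions):

from keras.layers import Input, Dense, concatenate, average
from keras.models import Model

def permutational_layer(xs, units=32):
    # Shared pairwise network: the same Dense is reused for all ordered
    # pairs (x_i, x_j), then results are averaged over j for each i.
    pairwise = Dense(units, activation='relu')
    outputs = []
    for xi in xs:
        pair_feats = [pairwise(concatenate([xi, xj])) for xj in xs]
        outputs.append(average(pair_feats))  # one equivariant output per input
    return outputs

sensors = [Input(shape=(4,)) for _ in range(6)]
hs = permutational_layer(sensors, units=32)
pooled = average(hs)                         # symmetric pooling -> order-invariant
out = Dense(3, activation='linear')(pooled)  # X, Y, Z
model = Model(inputs=sensors, outputs=out)

Because the pairwise network sees each sensor together with every other sensor before the final averaging, the model can exploit relationships between inputs, yet permuting the sensor order leaves the final output unchanged.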