Tags: tensorflow, machine-learning, keras, nlp, keras-layer

How to train a model whose labels have shape [5, 30]?


How do I train on a dataset in which each label has shape [5, 30]? For example:

[
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 54, 55, 21, 56, 57,  3,
        22, 19, 58,  6, 59,  4, 60,  1, 61, 62, 23, 63, 23, 64],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  1, 65,  7, 66,  2, 67, 68,  3, 69, 70],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0, 11, 12,  5, 13, 14,  9, 10,  5, 15, 16, 17,  2,  8],
       [ 0,  0,  0,  0,  0,  2, 71,  1, 72, 73, 74,  7, 75, 76, 77,  3,
        20, 78, 18, 79,  1, 21, 80, 81,  3, 82, 83, 84,  6, 85],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  2,
        86, 87,  3, 88, 89,  1, 90, 91, 22, 92, 93,  4,  6, 94]
]

One option is to reshape the labels to [150], but that would make the tokenized sentences lose their meaning. Please suggest how to arrange the Keras layers (and which layers to use) so that I can build this model. I want to be able to generate sentences later.

My model code right now is this:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam

model = tf.keras.Sequential([
    feature_layer,  # built elsewhere from the tabular input columns
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dropout(.1),
    layers.Dense(5),
    layers.Dense(30, activation='softmax'),
])
opt = Adam(learning_rate=0.01)
model.compile(optimizer=opt, loss='mean_absolute_percentage_error', metrics=['accuracy'])

The actual data.

state | district | month | rainfall | max_temp | min_temp | max_rh | min_rh | wind_speed | advice
Orissa | Kendrapada | february | 0.0 | 34.6 | 19.4 | 88.2 | 29.6 | 12.0 | chances of foot rot disease in paddy crop; apply urea at 3 weeks after transplanting at active tillering stage for paddy; ......
Jharkhand | Saraikela Kharsawan | february | 0 | 35.2 | 16.6 | 29.4 | 11.2 | 3.6 | provide straw mulch and go for intercultural operations to avoid moisture losses from soil; chance of leaf blight disease in potato crop; .......

I need to be able to generate the advice texts.


Solution

  • If you do consider that the output needs to be in this shape (and not flattened), the easiest (and, in my opinion, also the correct) solution is a multi-output network, with each output being a layers.Dense(30, activation='softmax').

    You would have something like:

    def create_model():
        base_model = .... (stacked Dense units + other) # you can even create multi-input multi-output if you really want that.
        first_output = Dense(30,activation='softmax',name='output_1')(base_model) 
        second_output = Dense(30,activation='softmax',name='output_2')(base_model)
        ...
        fifth_output = Dense(30,activation='softmax',name='output_5')(base_model)
        model = Model(inputs=input_layer,
                      outputs=[first_output,second_output,third_output,fourth_output,fifth_output])
        return model
    
    
    model = create_model()
    optimizer = tf.keras.optimizers.Adam()
    model.compile(optimizer=optimizer,
                  loss={'output_1': 'sparse_categorical_crossentropy', 
                        'output_2': 'sparse_categorical_crossentropy',
                        'output_3': 'sparse_categorical_crossentropy',
                        'output_4': 'sparse_categorical_crossentropy',
                        'output_5': 'sparse_categorical_crossentropy'},
                  metrics={'output_1': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_2': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_3': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_4': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_5': tf.keras.metrics.SparseCategoricalAccuracy()})
    
    model.fit(X, y,
              epochs=100, batch_size=10, validation_data=(val_X, val_y))
    

    Here, note that y (both for train and validation) is a NumPy array of length 5 (the number of outputs), where each element has length 30.
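
    As a minimal sketch of how those per-output targets could be prepared (the array shape, vocabulary size, and variable names below are assumptions for illustration, not taken from the question): starting from a single label array of shape (num_samples, 5, 30), you can slice out one target array per output head, keyed by the output-layer names used above.

    import numpy as np

    # Hypothetical data: 100 samples, each labelled with a [5, 30] grid of integer token ids.
    num_samples = 100
    labels = np.random.randint(0, 95, size=(num_samples, 5, 30))

    # One target array per output head, keyed by the names given to the Dense layers above.
    y = {f'output_{i + 1}': labels[:, i, :] for i in range(5)}

    for name, arr in y.items():
        print(name, arr.shape)   # each target has shape (100, 30)

    The same dictionary can be passed as y to model.fit (a list in output order also works); how each head then consumes a length-30 target depends on the loss you pair with it.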

    Again, ensure that you actually need such a configuration; I posted this answer as a demonstration of multi-output labels in TensorFlow and Keras and for the benefit of others, but I am not 100% sure you actually need this exact configuration (perhaps you can opt for something simpler).

    Note the use of sparse_categorical_crossentropy, since your labels are not one-hot encoded (also, MAPE is a regression loss, not a classification loss).
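
    To make that distinction concrete, here is a tiny, self-contained comparison with made-up numbers (three positions, three classes): the sparse variant takes integer class ids directly, while plain categorical_crossentropy expects the same labels one-hot encoded.

    import tensorflow as tf

    y_true_sparse = [[2, 0, 1]]                 # integer class ids per position
    y_pred = [[[0.1, 0.2, 0.7],
               [0.8, 0.1, 0.1],
               [0.2, 0.6, 0.2]]]                # predicted probabilities per position

    sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()
    onehot_loss = tf.keras.losses.CategoricalCrossentropy()
    y_true_onehot = tf.one_hot(y_true_sparse, depth=3)

    # Both give the same value (~0.36); only the label encoding differs.
    print(float(sparse_loss(y_true_sparse, y_pred)))
    print(float(onehot_loss(y_true_onehot, y_pred)))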