Tags: tensorflow, machine-learning, keras, nlp, keras-layer

How to train a model whose labels have shape [5, 30]?


How do I train on a dataset in which each label has shape [5, 30]? For example:

[
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 54, 55, 21, 56, 57,  3,
        22, 19, 58,  6, 59,  4, 60,  1, 61, 62, 23, 63, 23, 64],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  1, 65,  7, 66,  2, 67, 68,  3, 69, 70],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0, 11, 12,  5, 13, 14,  9, 10,  5, 15, 16, 17,  2,  8],
       [ 0,  0,  0,  0,  0,  2, 71,  1, 72, 73, 74,  7, 75, 76, 77,  3,
        20, 78, 18, 79,  1, 21, 80, 81,  3, 82, 83, 84,  6, 85],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  2,
        86, 87,  3, 88, 89,  1, 90, 91, 22, 92, 93,  4,  6, 94]
]

One option is to reshape the labels to [150], but that would make the tokenized sentences lose their meaning. Please suggest how to arrange the Keras layers (and which layers to use) so that I can build this model. I want to be able to generate sentences later.

My model code right now is this:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam

model = tf.keras.Sequential([
    feature_layer,  # built elsewhere from the tabular input columns
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dropout(.1),
    layers.Dense(5),
    layers.Dense(30, activation='softmax'),
])
opt = Adam(learning_rate=0.01)
model.compile(optimizer=opt, loss='mean_absolute_percentage_error', metrics=['accuracy'])

The actual data.

state | district | month | rainfall | max_temp | min_temp | max_rh | min_rh | wind_speed | advice
Orissa | Kendrapada | february | 0.0 | 34.6 | 19.4 | 88.2 | 29.6 | 12.0 | chances of foot rot disease in paddy crop; apply urea at 3 weeks after transplanting at active tillering stage for paddy; ......
Jharkhand | Saraikela Kharsawan | february | 0 | 35.2 | 16.6 | 29.4 | 11.2 | 3.6 | provide straw mulch and go for intercultural operations to avoid moisture losses from soil; chance of leaf blight disease in potato crop; .......

I need to be able to generate the advice texts.


Solution

  • If you do consider that the output needs to be in this shape (and not flattened), the easiest (and, in my opinion, also the correct) solution is a multi-output network, with each output being a layers.Dense(30, activation='softmax').

    You would have something like:

    def create_model():
        base_model = .... (stacked Dense units + other) # you can even create multi-input multi-output if you really want that.
        first_output = Dense(30,activation='softmax',name='output_1')(base_model) 
        second_output = Dense(30,activation='softmax',name='output_2')(base_model)
        ...
        fifth_output = Dense(30,activation='softmax',name='output_5')(base_model)
        model = Model(inputs=input_layer,
                      outputs=[first_output,second_output,third_output,fourth_output,fifth_output])
        return model
    
    
    model = create_model()
    optimizer = tf.keras.optimizers.Adam()
    model.compile(optimizer=optimizer,
                  loss={'output_1': 'sparse_categorical_crossentropy', 
                        'output_2': 'sparse_categorical_crossentropy',
                        'output_3': 'sparse_categorical_crossentropy',
                        'output_4': 'sparse_categorical_crossentropy',
                        'output_5': 'sparse_categorical_crossentropy'},
                  metrics={'output_1': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_2': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_3': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_4': tf.keras.metrics.SparseCategoricalAccuracy(),
                           'output_5': tf.keras.metrics.SparseCategoricalAccuracy()})
    
    model.fit(X, y,
              epochs=100, batch_size=10, validation_data=(val_X, val_y))
    

    Here, note that y (both for train and validation) is a NumPy array of length 5 (the number of outputs), where each element has length 30.
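
    As a minimal sketch of how those per-output targets could be prepared (the array shape, vocabulary size, and variable names below are assumptions for illustration, not taken from the question): starting from a single label array of shape (num_samples, 5, 30), you can slice out one target array per output head, keyed by the output-layer names used above.

    import numpy as np

    # Hypothetical data: 100 samples, each labelled with a [5, 30] grid of integer token ids.
    num_samples = 100
    labels = np.random.randint(0, 95, size=(num_samples, 5, 30))

    # One target array per output head, keyed by the names given to the Dense layers above.
    y = {f'output_{i + 1}': labels[:, i, :] for i in range(5)}

    for name, arr in y.items():
        print(name, arr.shape)   # each target has shape (100, 30)

    The same dictionary can be passed as y to model.fit (a list in output order also works); how each head then consumes a length-30 target depends on the loss you pair with it.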

    Again, ensure that you actually need such a configuration; I posted this answer as a demonstration of multi-output labels in TensorFlow and Keras and for the benefit of others, but I am not 100% sure you actually need this exact configuration (perhaps you can opt for something simpler).

    Note the use of sparse_categorical_crossentropy, since your labels are not one-hot encoded (also, MAPE is a regression loss, not a classification loss).
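
    To make that distinction concrete, here is a tiny, self-contained comparison with made-up numbers (three positions, three classes): the sparse variant takes integer class ids directly, while plain categorical_crossentropy expects the same labels one-hot encoded.

    import tensorflow as tf

    y_true_sparse = [[2, 0, 1]]                 # integer class ids per position
    y_pred = [[[0.1, 0.2, 0.7],
               [0.8, 0.1, 0.1],
               [0.2, 0.6, 0.2]]]                # predicted probabilities per position

    sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()
    onehot_loss = tf.keras.losses.CategoricalCrossentropy()
    y_true_onehot = tf.one_hot(y_true_sparse, depth=3)

    # Both give the same value (~0.36); only the label encoding differs.
    print(float(sparse_loss(y_true_sparse, y_pred)))
    print(float(onehot_loss(y_true_onehot, y_pred)))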