I am doing sentiment analysis on a set of reviews, predicting the rating (0-5) from the text of the review. I have completed text pre-processing and tokenizing. I am using pre-trained word vector embeddings (GoogleNews) and have created the embedding_matrix.
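For reference, the embedding matrix was built roughly along these lines (a simplified sketch; the file path, the tokenizer variable, and the use of gensim are approximations of my setup, not the exact code):

import numpy as np
from gensim.models import KeyedVectors

# load the pre-trained GoogleNews vectors (path is illustrative)
w2v = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

embedding_dim = w2v.vector_size
vocab_size = len(tokenizer.word_index) + 1

# one row per token in the tokenizer's vocabulary; words not in the vectors stay all-zero
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in tokenizer.word_index.items():
    if word in w2v:
        embedding_matrix[i] = w2v[word]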
I have built the model thus far:
#defining X (padded) and y and completing train/test split
X = pad_sequences(sequences, maxlen= 1000)
y = df['rating']
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.25, random_state = 1000)
y_train = to_categorical(y_train,6)
#building the model
sentiment_wv_model = Sequential()
embed_layer = Embedding(vocab_size, 100,weights = [embedding_matrix], input_length = 1000,trainable = True)
sentiment_wv_model.add(embed_layer)
sentiment_wv_model.add(Dense(100, activation = 'sigmoid'))
sentiment_wv_model.add(Dense(32, activation = 'sigmoid'))
sentiment_wv_model.add(Dense(1, activation='softmax'))
#compile model and fit to train data
sentiment_wv_model.compile(loss = 'categorical_crossentropy',optimizer = 'adam', metrics =['accuracy'])
sentiment_wv_model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, 1000, 100)         3631400
dense (Dense)                (None, 1000, 100)         10100
dense_1 (Dense)              (None, 1000, 32)          3232
dense_2 (Dense)              (None, 1000, 2)           66
dense_3 (Dense)              (None, 1000, 1)           3
dense_4 (Dense)              (None, 1000, 100)         200
dense_5 (Dense)              (None, 1000, 32)          3232
dense_6 (Dense)              (None, 1000, 1)           33
=================================================================
Total params: 3,648,266
Trainable params: 3,648,266
Non-trainable params: 0
_________________________________________________________________
sentiment_wv_model.fit(X_train, y_train, batch_size = 32, epochs = 5, verbose =2)
Running this, I get the following error:
ValueError: in user code:
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\engine\training.py", line 878, in train_function *
return step_function(self, iterator)
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\engine\training.py", line 867, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\engine\training.py", line 860, in run_step **
outputs = model.train_step(data)
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\engine\training.py", line 809, in train_step
loss = self.compiled_loss(
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\losses.py", line 141, in __call__
losses = call_fn(y_true, y_pred)
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\losses.py", line 245, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\losses.py", line 1664, in categorical_crossentropy
return backend.categorical_crossentropy(
File "C:\Users\tammy\Anaconda3\lib\site-packages\keras\backend.py", line 4994, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (None, 6, 6) and (None, 1000, 1) are incompatible
I see that this type of question has been asked a few times, and I have tried other solutions such as applying to_categorical to y, changing the activation functions, and switching to 'binary_crossentropy' (the last two didn't make sense to me, but I tried them anyway). Please advise!
Your y-values are currently one-hot encoded:
y_train = to_categorical(y_train, 6)
This should give y_train the shape (num_samples, 6), which you can check with y_train.shape. (The (None, 6, 6) in your error message suggests to_categorical may have run more than once, e.g. from re-executing that cell, so make sure it is applied only once.)
One fix that should work is simply changing the size of your output layer to 6:
sentiment_wv_model.add(Dense(6, activation='softmax'))
[Optional] Alternatively, you can keep y_train as plain integer labels (skip the to_categorical step) and use sparse_categorical_crossentropy as the loss:
sentiment_wv_model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
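In that case the label preparation just drops the one-hot step (a minimal sketch based on the code in the question):

# sparse_categorical_crossentropy expects plain integer class labels (0-5)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 1000)
# note: no to_categorical(y_train, 6) here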
Also, you should consider flattening the output of the Embedding layer (e.g. with a Flatten layer), so that you end up with the output shape (None, 6) instead of (None, 1000, 6); see the sketch below.
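Putting this together, a minimal sketch of the corrected model could look like the following (keeping your layer sizes and sigmoid activations; vocab_size, embedding_matrix, and the padded X_train come from your own code, and y_train is the one-hot version):

from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense

sentiment_wv_model = Sequential()
sentiment_wv_model.add(Embedding(vocab_size, 100, weights=[embedding_matrix],
                                 input_length=1000, trainable=True))
sentiment_wv_model.add(Flatten())                         # (None, 1000, 100) -> (None, 100000)
sentiment_wv_model.add(Dense(100, activation='sigmoid'))
sentiment_wv_model.add(Dense(32, activation='sigmoid'))
sentiment_wv_model.add(Dense(6, activation='softmax'))    # one probability per rating class 0-5
sentiment_wv_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
sentiment_wv_model.fit(X_train, y_train, batch_size=32, epochs=5, verbose=2)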