I have a simple LSTM model built with Keras and TensorFlow. I compiled and trained it, made a test prediction with a simple sentence, and it worked. I then saved the model using model.save('sentanalysis.h5') and everything was OK. Later, I loaded the model with keras.models.load_model(), and it loaded without error, but when I called model.predict() I got an array of floats that doesn't show anything related to the classes.
How can I use my pretrained model to make new classifications?
The dataset I use to train it is very simple: a CSV with text and sentiment columns, nothing else.
Can you help me?
This is the code of the model:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import nlp
import random
from keras.preprocessing.text import Tokenizer
from keras_preprocessing.sequence import pad_sequences
dataset = nlp.load_dataset('csv', data_files={
    'train': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_train_final.csv',
    'test': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_test_final.csv',
    'validation': '/content/drive/MyDrive/Proyect/BehaviorClassifier/tass2019_pe_val_final.csv'})
train = dataset['train']
val = dataset['validation']
test = dataset['test']
def get_tweet(data):
    tweets = [x['Text'] for x in data]
    labels = [x['behavior'] for x in data]
    return tweets, labels
tweets, labels = get_tweet(train)
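tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
# Build the vocabulary from the training tweets before converting text to sequences
tokenizer.fit_on_texts(tweets)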
maxlen = 140
def get_sequences(tokenizer, tweets):
    sequences = tokenizer.texts_to_sequences(tweets)
    padded = pad_sequences(sequences, truncating='post', padding='post', maxlen=maxlen)
    return padded
padded_train_seq = get_sequences(tokenizer, tweets)
classes = set(labels)
class_to_index = dict((c, i) for i, c in enumerate(classes))
index_to_class = dict((v, k) for k, v in class_to_index.items())
names_to_ids = lambda labels: np.array([class_to_index.get(x) for x in labels])
train_labels = names_to_ids(labels)
model = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(10000, 16, input_length=maxlen),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20, return_sequences=True)),
    tf.keras.layers.Dense(6, activation='softmax')
])
val_tweets, val_labels = get_tweet(val)
val_seq = get_sequences(tokenizer, val_tweets)
val_labels = names_to_ids(val_labels)
h = model.fit(
    padded_train_seq, train_labels,
    validation_data=(val_seq, val_labels),
    #callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=2)]
)
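test_tweets, test_labels = get_tweet(test)
test_seq = get_sequences(tokenizer, test_tweets)
# Convert label names to integer ids, as done for the training and validation labels
test_labels = names_to_ids(test_labels)
model.evaluate(test_seq, test_labels)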
# This code works when I run it after the previous code
sentence = 'I am very happy now'
sequence = tokenizer.texts_to_sequences([sentence])
paddedSequence = pad_sequences(sequence, truncating = 'post', padding='post', maxlen=maxlen)
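p = model.predict(np.expand_dims(paddedSequence[0], axis=0))[0]
# Map the highest-probability index back to its class name
pred_class = index_to_class[int(np.argmax(p))]
print('Sentence: ', sentence)
print('Sentiment: ', pred_class)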
And this is how I load my saved model without running the previous code:
model = keras.models.load_model('/content/drive/MyDrive/Proyect/BehaviorClassifier/twitterBehaviorClassifier.h5')
new = ["I am very happy"]
tokenizer = Tokenizer(num_words=10000, oov_token='<UNK>')
seq = tokenizer.texts_to_sequences(new)
padded = pad_sequences(seq, maxlen=140)
pred = model.predict(padded)
And I get this:
1/1 [==============================] - 0s 29ms/step
[[7.0648360e-01 1.1568426e-01 1.7581969e-01 7.2872970e-04 4.2903548e-04
I've been reading some documentation, but nothing has helped.
So, from your model code, you have the following:
tf.keras.layers.Dense(6, activation='softmax')
Presumably, you have 6 different sentiment classes. The output you are seeing from model.predict() is the set of probabilities that the input belongs to each class, i.e. a 70.6% chance that the sentiment is class 0, 11.6% that it is class 1, 17.6% that it is class 2, etc.
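To make that reading concrete, here is a minimal sketch using a made-up six-value probability row. The numbers are illustrative only, not the output from the question, and `pred` simply stands in for what model.predict() would return for one sentence.

```python
import numpy as np

# Hypothetical softmax output for one input sentence: shape (1, 6),
# i.e. one row per input and one column per class.
pred = np.array([[0.70, 0.12, 0.15, 0.01, 0.01, 0.01]])

# Each row of a softmax output sums to 1 (up to floating-point error).
row_sum = float(pred[0].sum())
print("row sum:", round(row_sum, 6))

# Show every class index next to its probability as a percentage.
for class_id, prob in enumerate(pred[0]):
    print(f"class {class_id}: {prob * 100:.1f}%")
```

The row is a probability distribution over the six classes, so no single value is "the answer" until you pick one, usually the largest.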
So what is typically done to postprocess these results is to take the largest probability as the prediction using np.argmax(pred), which in the case you posted should give you 0. This can be interpreted as: your model believes the tweet most likely belongs to class 0, with a probability of 70.6%.
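As a rough sketch of that postprocessing step, the snippet below takes the argmax of each row and maps it back to a label. The `pred` values and the label names in `index_to_class` are hypothetical placeholders; in your case the mapping would be the same `index_to_class` dict that was built when the model was trained.

```python
import numpy as np

# Hypothetical model.predict() output for two sentences: shape (2, 6).
pred = np.array([
    [0.70, 0.12, 0.15, 0.01, 0.01, 0.01],
    [0.05, 0.10, 0.05, 0.60, 0.10, 0.10],
])

# Hypothetical label names; in practice, reuse the index_to_class dict
# that was built at training time.
index_to_class = {0: "label_a", 1: "label_b", 2: "label_c",
                  3: "label_d", 4: "label_e", 5: "label_f"}

# axis=-1 picks the most likely class per row, so this also works for batches.
predicted_ids = np.argmax(pred, axis=-1)
confidences = np.max(pred, axis=-1)
predicted_labels = [index_to_class[int(i)] for i in predicted_ids]

for label, conf in zip(predicted_labels, confidences):
    print(f"{label} ({conf:.1%})")
```

One caveat if the mapping is rebuilt in a new session as in the question: `classes = set(labels)` does not guarantee the same ordering of string labels from one Python process to the next, so the mapping used at training time would need to be saved and reloaded together with the model.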