I have some text without any labels. Just a bunch of text files. And I want to train an Embedding layer to map the words to embedding vectors. Most of the examples I've seen so far are like this:
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense
model = Sequential()
model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc'])
model.fit(x_train, y_train,
          epochs=10,
          batch_size=32,
          validation_data=(x_val, y_val))
They all assume that the Embedding layer is part of a bigger model which tries to predict a label. But in my case, I have no labels and I'm not trying to classify anything. I just want to train the mapping from words (more precisely, integers) to embedding vectors. Yet the fit method of the model asks for both x_train and y_train (as in the example given above).
How can I train a model only with an Embedding layer and no labels?
[UPDATE]
Based on the answer I got from @Daniel Möller, the Embedding layer in Keras implements a supervised algorithm and thus cannot be trained without labels. Initially, I thought it was a variation of Word2Vec and thus did not need labels to be trained. Apparently, that's not the case. Personally, I ended up using FastText, which has nothing to do with Keras or Python.
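For anyone who wants to stay in Python: a comparable unsupervised embedding can also be trained with gensim's FastText implementation. A minimal sketch, assuming gensim 4.x and that sentences is an iterable of token lists (all hyperparameter values here are illustrative only):

from gensim.models import FastText

# 'sentences' is assumed to be an iterable of tokenized sentences,
# e.g. [['some', 'text'], ['another', 'sentence']]
model = FastText(sentences=sentences, vector_size=100,
                 window=5, min_count=1, epochs=10)

vector = model.wv['text']  # the learned embedding vector for a word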
Does it make sense to do that without a label/target?
How will your model decide which values in the vectors are good for anything if there is no objective?
All embeddings are "trained" for a purpose. If there is no purpose, there is no target; and if there is no target, there is no training.
If you really want to transform words into vectors without any purpose/target, you've got two options:

1. One-hot encode the words. Keras has the to_categorical function for that (a short sketch follows the warning below).
2. Build a naive Word2Vec-like model trained with dummy "labels", which the rest of this answer walks through.

Warning: I don't really know anything about Word2Vec, but I'll try to show how to add rules for your embedding using some naive kind of word distance, and how to use dummy "labels" just to satisfy Keras' way of training.
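First, a minimal sketch of the one-hot option, borrowing max_words from the question's code as the vocabulary size (the word ids here are made up):

import numpy as np
from keras.utils import to_categorical

# integer-encoded words, e.g. the output of a Keras Tokenizer
word_ids = np.array([2, 5, 1, 7])

# each word becomes a vector of length max_words containing a single 1
one_hot_vectors = to_categorical(word_ids, num_classes=max_words)

For the second option, the naive distance model looks like this: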
from keras.layers import Input, Embedding, Subtract, Lambda
import keras.backend as K
from keras.models import Model
input1 = Input((1,)) #word1
input2 = Input((1,)) #word2
embeddingLayer = Embedding(max_words, embedding_dim)  # vocabulary size and vector size, as in the question's code
word1 = embeddingLayer(input1)
word2 = embeddingLayer(input2)
#naive distance rule, subtract, expect zero difference
word_distance = Subtract()([word1,word2])
#reduce all dimensions to a single dimension
word_distance = Lambda(lambda x: K.mean(x, axis=-1))(word_distance)
model = Model([input1,input2], word_distance)
Now that our model outputs a word distance directly, our labels will be "zero". They're not really labels for supervised training, but they are the expected output of the model, which is something Keras needs in order to train.
We can use mae (mean absolute error) or mse (mean squared error) as the loss function, for instance.
model.compile(optimizer='adam', loss='mse')
And training with word2 being the word after word1:
import numpy as np

# entireText: the whole corpus as one sequence of integer-encoded words
xTrain1 = entireText[:-1]  # word1: every word except the last
xTrain2 = entireText[1:]   # word2: the word that follows word1
yTrain = np.zeros((len(xTrain1),))  # the expected distance is zero

model.fit([xTrain1, xTrain2], yTrain, epochs=10, batch_size=32)  # example hyperparameters
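Once trained, you can pull the learned mapping straight out of the layer; get_weights is standard Keras API, and each row index corresponds to a word's integer id:

# the embedding matrix, shape (max_words, embedding_dim)
embedding_matrix = embeddingLayer.get_weights()[0]

word_vector = embedding_matrix[5]  # the vector for the word with id 5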
Although this may be completely wrong regarding what Word2Vec really does, it shows the main points: