So basically I have an LSTM model which takes in a bunch of numbers (these numbers are actually music notes that I converted into numbers; my goal is to create computer-generated music, if you were wondering). The issue I am running into is that I do not know how to make a prediction. What I want the computer to output is a list (or string, or whatever it can) of numbers that follows whatever rules it came up with during the training process. In previous projects, I only knew how to output one prediction number by giving the computer some data to predict on, but here I want a completely new list without giving the computer a starting value. Preferably the computer can generate more than one number at a time.
Here is the code that I currently have. It does not work right now:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

n_steps = 1
X, y = split_sequence(data, n_steps)
X = X.reshape((X.shape[0], X.shape[1], 1))
X = tf.cast(X, dtype='float32')
model = Sequential()
model.add(LSTM(256, activation='relu', return_sequences=True))
#model.add(Dropout(0.2)) # I am not sure what this is, but it doesn't break my code
model.add(LSTM(128, activation='relu', return_sequences=True))
#model.add(Dropout(0.2))
model.add(LSTM(128))
#model.add(Dropout(0.2))
model.add(Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(X, y, epochs=10, batch_size=2, verbose=2)
prediction = model.predict(X) # I want to output a list of numbers
print(prediction)
Now, my prediction is outputting a really long list of lists containing the same value, which I think is the one prediction. It looks like this:
[[62.449333]
[62.449333]
[62.449333]
...
[62.449333]
[62.449333]
[62.449333]]
I want a list that is not a prediction, but more like a GAN output of a brand new list of numbers. Also, I am not sure why this prediction is outputting a really long list of lists.
data looks something like this, it is shortened for brevity:
[64, 76, 64, 75, 64, 76, 64, 75, 64, 76, 64, 71, 64, 74, 64, 72, 69, 64, 45, 64, 52]
The x train looks like this when n_steps = 1:
[[64], [76], [64], [75], [64], [76], [64], [75], [64], [76], [64], [71], [64], [74]]
and y looks like this, with each one being the expected output for the corresponding x train:
[76, 64, 75, 64, 76, 64, 75, 64, 76, 64, 71, 64, 74, 64]
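For context, the `split_sequence` helper isn't shown in the question, but a helper that produces x/y pairs like those above is typically written as follows (a sketch; your version may differ):

```python
import numpy as np

def split_sequence(sequence, n_steps):
    # Slide a window of length n_steps over the sequence;
    # each window is an input, the value right after it is the target.
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        y.append(sequence[i + n_steps])
    return np.array(X), np.array(y)

notes = [64, 76, 64, 75, 64, 76, 64, 75]
X, y = split_sequence(notes, 1)
# X -> [[64], [76], [64], [75], [64], [76], [64]]
# y -> [76, 64, 75, 64, 76, 64, 75]
```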
Any help will be greatly appreciated!!
I think the structure of your model is fine, but the data needs some work. Your LSTM is only set up to output one value, which you can see from your last LSTM layer not having return_sequences=True. The fact that your y labels have multiple values must be confusing the model.
I think you should keep this behaviour, but edit your input/output data as follows:
If one sequence in your data is:
[64, 76, 64, 75, 64, 76, 64, 75, 64, 76, 64, 71, 64, 74, 64, 72, 69, 64, 45, 64, 52]
Then your training examples and labels should be:
x[0] = [64]
y[0] = [76]
x[1] = [64, 76]
y[1] = [64]
x[2] = [64, 76, 64]
y[2] = [75]
Every step of the sequence can be a separate training example, but each y label should only be one output.
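Concretely, the growing-prefix examples above can be built with a couple of list comprehensions (a sketch using the sample sequence):

```python
sequence = [64, 76, 64, 75, 64, 76, 64, 75]

# Every prefix of the sequence becomes one training example;
# the note that follows the prefix is its single y label.
xs = [sequence[:i] for i in range(1, len(sequence))]
ys = [sequence[i] for i in range(1, len(sequence))]

# xs[0] -> [64]          ys[0] -> 76
# xs[1] -> [64, 76]      ys[1] -> 64
# xs[2] -> [64, 76, 64]  ys[2] -> 75
```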
Your linear output could work, but I think this may work better as a categorical problem with a softmax output. The final dense layer should have one unit for each possible note that your model can output. You would also have to pad these sequences with 0 values so that all your x inputs are the same length, so the x values would actually be:
x[0] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64]
x[1] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64, 76]
etc. with the length of the array being your max sequence length.
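The pre-padding can be done with keras' pad_sequences, or equivalently in plain numpy. Here is a sketch of the padding step (note that 0 is reserved for padding here, so it shouldn't also be used as a real note value):

```python
import numpy as np

sequence = [64, 76, 64, 75, 64, 76, 64, 75]
xs = [sequence[:i] for i in range(1, len(sequence))]
ys = [sequence[i] for i in range(1, len(sequence))]

max_sequence_len = max(len(x) for x in xs)
# Left-pad each example with zeros to a common length
# (the same result as pad_sequences(xs, padding='pre')).
X = np.array([[0] * (max_sequence_len - len(x)) + x for x in xs])
y = np.array(ys)
# X[0] -> [0, 0, 0, 0, 0, 0, 64]
# X[1] -> [0, 0, 0, 0, 0, 64, 76]
```

For the categorical setup, the final layer would then be Dense(128, activation='softmax') (assuming 128 possible MIDI pitches), compiled with the sparse_categorical_crossentropy loss so each integer y label is matched against a probability over all possible next notes.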
When it comes to predicting, use a loop. You'd give the model a one-value input sequence, then append the predicted note to the input sequence and feed it back to the model again:
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

seed_note = [64]  # initial note to give the model
next_notes = 10   # how many notes to predict
for _ in range(next_notes):
    # pad_sequences expects a list of sequences, hence the extra brackets
    token_list = pad_sequences([seed_note], maxlen=max_sequence_len, padding='pre')  # pad with 0s
    # take the note with the highest predicted probability as the next note
    predicted = int(np.argmax(model.predict(token_list), axis=-1)[0])
    seed_note.append(predicted)
print(seed_note)