I am trying to replace my FFNN with an LSTM layer. As input I get 360 lidar data points and 4 additional values for distance etc. The algorithm shall learn to navigate a robot through its environment. With the FFNN it works absolutely fine, and for the LSTM I started like this:
# collected data for RL
scan_range = []  # filled with .append, length=360
state = scan_range + [heading, current_distance, obstacle_min_range, obstacle_angle]
return np.asarray(state)  # state vector of length 364
Based on that data, there is some analysis for the next state, whether the goal is achieved, etc. The data is stored in memory:
agent.appendMemory(state, action, reward, next_state, done)
which will do:
self.memory.append((state, action, reward, next_state, done))
The action and reward are plain numbers, and next_state is again an array.
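For context, a minimal sketch of how such a replay memory can be set up (assuming self.memory is a collections.deque; the Agent class name and the maxlen value are illustrative, only appendMemory and self.memory appear in my actual code):

from collections import deque

class Agent(object):  # illustrative class name
    def __init__(self, memory_size=100000):  # memory_size is an assumed value
        # a deque drops the oldest transitions once maxlen is reached
        self.memory = deque(maxlen=memory_size)

    def appendMemory(self, state, action, reward, next_state, done):
        # one transition per tuple, as above
        self.memory.append((state, action, reward, next_state, done))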
Next, I build up the neural network with the LSTM layer:
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense, Activation
from keras.optimizers import RMSprop

model = Sequential()
model.add(SimpleRNN(64, input_shape=(1, 364)))  # recurrent layers expect 3-D input: (batch, timesteps, features)
model.add(Dense(self.action_size, kernel_initializer='lecun_uniform'))
model.add(Activation('linear'))
model.compile(loss='mse', optimizer=RMSprop(lr=self.learning_rate, rho=0.9, epsilon=1e-06))
model.summary()
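As a quick standalone sanity check (not part of the agent code; the action size of 5 is an assumed value), you can confirm what input shape such a model expects by feeding it dummy data:

import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense, Activation

model = Sequential()
model.add(SimpleRNN(64, input_shape=(1, 364)))
model.add(Dense(5))  # 5 = assumed action size
model.add(Activation('linear'))

# recurrent layers consume (num_samples, num_timesteps, num_features)
dummy = np.zeros((1, 1, 364))
print(model.predict(dummy).shape)  # -> (1, 5)
# a 2-D array of shape (1, 364) raises the dimension error shown below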
Everything is then trained using a mini-batch; for the FFNN, the method looked like this:
def trainModel(self, target=False):
    mini_batch = random.sample(self.memory, self.batch_size)
    X_batch = np.empty((0, self.state_size), dtype=np.float64)
    Y_batch = np.empty((0, self.action_size), dtype=np.float64)
    for i in range(self.batch_size):
        states = mini_batch[i][0]
        actions = mini_batch[i][1]
        rewards = mini_batch[i][2]
        next_states = mini_batch[i][3]
        dones = mini_batch[i][4]
        q_value = self.model.predict(states.reshape((1, len(states))))
        self.q_value = q_value
        if target:
            next_target = self.target_model.predict(next_states.reshape((1, len(next_states))))
        else:
            next_target = self.model.predict(next_states.reshape((1, len(next_states))))
        next_q_value = self.getQvalue(rewards, next_target, dones)
        X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)
        Y_sample = q_value.copy()
        Y_sample[0][actions] = next_q_value
        Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)
        if dones:
            X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
            Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)
    print(X_batch.shape)
    print(Y_batch.shape)
    self.model.fit(X_batch, Y_batch, batch_size=self.batch_size, epochs=1, verbose=0)
When I don't change the code, I naturally get the dimension error: expected simple_rnn_1_input to have 3 dimensions, but got array with shape (1, 364), because the input is still two-dimensional while the LSTM needs three dimensions. I then tried to add the third dimension manually, just to see if everything works:
mini_batch = random.sample(self.memory, self.batch_size)
X_batch = np.empty((0, self.state_size), dtype=np.float64)
Y_batch = np.empty((0, self.action_size), dtype=np.float64)
Z_batch = np.empty((0, 1), dtype=np.float64)
for i in range(self.batch_size):
    states = mini_batch[i][0]
    actions = mini_batch[i][1]
    rewards = mini_batch[i][2]
    next_states = mini_batch[i][3]
    dones = mini_batch[i][4]
    q_value = self.model.predict(states.reshape((1, len(states))))
    self.q_value = q_value
    if target:
        next_target = self.target_model.predict(next_states.reshape((1, 1, len(next_states))))
    else:
        next_target = self.model.predict(next_states.reshape((1, 1, len(next_states))))
    next_q_value = self.getQvalue(rewards, next_target, dones)
    X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)
    Y_sample = q_value.copy()
    Y_sample[0][actions] = next_q_value
    Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)
    Z_batch = np.append(Z_batch, np.array([[1]]), axis=0)
    if dones:
        X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
        Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)
        Z_batch = np.append(Z_batch, np.array([[1]]), axis=0)
self.model.fit(X_batch, Y_batch, Z_batch, batch_size=self.batch_size, epochs=1, verbose=0)
When I do this, .fit() gives the following error: TypeError: fit() got multiple values for keyword argument 'batch_size'
My question is now whether .fit() is suited for the LSTM framework in this case. In the documentation, only x and y are given as data arguments. Z seems useless in this case, but the LSTM still needs 3 dimensions as input.
My other question is: if I want to use the LSTM framework properly and not with dummy dimensions, do I have to use more than the current state?
Can I then, e.g., just append together the last 10 states so that states.shape = (10, 1, 364)? Is that a good timestep range, or should it be longer?
Kind regards!
I believe your basic issue is that the 3rd dimension needs to be added to X_batch, not passed as another argument to model.fit: a third positional argument to fit lands in the batch_size slot, which is exactly the TypeError you saw.
In particular, Keras models don't usually specify the "batch"/"sample" dimension in the model layers; it is automatically inferred from the shape of the X_batch input data. In your case, you have a SimpleRNN with input_shape=(1, 364) as the first layer. Keras interprets this to mean that the input data X_batch should have a shape like this:
(num_samples, 1, 364).
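Concretely, in your trainModel this means allocating X_batch with three dimensions and reshaping each state to (1, 1, 364) before predicting and appending. A sketch based on your snippet, keeping your variable names (untested against your environment):

X_batch = np.empty((0, 1, self.state_size), dtype=np.float64)
Y_batch = np.empty((0, self.action_size), dtype=np.float64)
for i in range(self.batch_size):
    states = mini_batch[i][0]
    actions = mini_batch[i][1]
    rewards = mini_batch[i][2]
    next_states = mini_batch[i][3]
    dones = mini_batch[i][4]
    # (1, 1, 364) = (num_samples, num_timesteps, num_features), for predict too
    q_value = self.model.predict(states.reshape((1, 1, len(states))))
    if target:
        next_target = self.target_model.predict(next_states.reshape((1, 1, len(next_states))))
    else:
        next_target = self.model.predict(next_states.reshape((1, 1, len(next_states))))
    next_q_value = self.getQvalue(rewards, next_target, dones)
    # keep the timestep axis when appending, so X_batch stays 3-D
    X_batch = np.append(X_batch, states.reshape((1, 1, len(states))), axis=0)
    Y_sample = q_value.copy()
    Y_sample[0][actions] = next_q_value
    Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)
    if dones:
        X_batch = np.append(X_batch, next_states.reshape((1, 1, len(next_states))), axis=0)
        Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)
# fit only needs x and y; no Z_batch
self.model.fit(X_batch, Y_batch, batch_size=self.batch_size, epochs=1, verbose=0)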
Also, if you want to create a sequence of timesteps, you would provide X_batch with the following shape:
(num_samples, num_timesteps, 364) or something similar.
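For example, to feed the model windows of the last 10 states at prediction time, you could buffer them and stack along the timestep axis. A sketch (history here is a hypothetical buffer, not from your code):

from collections import deque
import numpy as np

num_timesteps = 10
history = deque(maxlen=num_timesteps)  # holds the 10 most recent state vectors

def to_lstm_input(history):
    # stack the buffered 364-dim states into (1, num_timesteps, 364)
    window = np.stack(history, axis=0)
    return window.reshape((1,) + window.shape)

# usage:
# history.append(state)
# if len(history) == num_timesteps:
#     q_value = model.predict(to_lstm_input(history))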
This page has some good discussion: https://keras.io/getting-started/sequential-model-guide/ (for example, search for "Stacked LSTM for sequence classification" to help illustrate). Be careful with the return_sequences=True used there, though: for a single LSTM, you probably want return_sequences=False.
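The single-layer variant would look something like this (a sketch; num_timesteps and action_size are assumed values, and return_sequences=False is the default anyway):

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

num_timesteps = 10  # assumed window length
action_size = 5     # assumed number of actions

model = Sequential()
# return_sequences=False (the default): only the last timestep's output
# is passed on to the Dense Q-value head
model.add(LSTM(64, input_shape=(num_timesteps, 364)))
model.add(Dense(action_size, kernel_initializer='lecun_uniform'))
model.add(Activation('linear'))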
I hope this helps.