I'm trying to draw in my mind the structure of LSTMs and I don't understand what the kernel and the recurrent kernel are. According to this post, in the LSTMs section, the kernel is the four matrices that are multiplied by the input and the recurrent kernel is the four matrices that are multiplied by the hidden state. But what are those four matrices in this diagram?
Are they the gates?
I was testing with this app how the units variable of the code below affects the kernel, recurrent kernel and bias:
from keras.models import Sequential
from keras.layers import LSTM
model = Sequential()
model.add(LSTM(units=1, input_shape=(1, look_back)))
with look_back = 1
it returns a kernel of shape (1, 4), a recurrent kernel of shape (1, 4) and a bias of shape (4,).
With units = 2 it returns a kernel of shape (1, 8), a recurrent kernel of shape (2, 8) and a bias of shape (8,).
With units = 3 it returns a kernel of shape (1, 12), a recurrent kernel of shape (3, 12) and a bias of shape (12,).
Testing with these values I could deduce that the kernel is <1 x (4u)> and the recurrent kernel is <u x (4u)>, where u = units, but I don't know how this works internally. What do <1 x (4u)> and <u x (4u)> actually mean?
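For reference, the same shapes can be read with something like the snippet below (assuming the argument is spelled units and that get_weights() returns the kernel, recurrent kernel and bias in that order):

from keras.models import Sequential
from keras.layers import LSTM

look_back = 1
for units in (1, 2, 3):
    model = Sequential()
    model.add(LSTM(units=units, input_shape=(1, look_back)))
    kernel, recurrent_kernel, bias = model.get_weights()
    # e.g. units = 3 prints: 3 (1, 12) (3, 12) (12,)
    print(units, kernel.shape, recurrent_kernel.shape, bias.shape)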
The kernels are basically the weights handled by the LSTM cell
units = neurons, like in a classic multilayer perceptron
It is not shown in your diagram, but the input is a vector X with 1 or more values, and each value is sent to a neuron with its own weight w (which we learn with backpropagation).
The four matrices are the per-gate weight matrices, usually written Wf, Wi, Wc and Wo.
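In the notation of the post linked at the bottom (\sigma is the sigmoid), they appear in the four gate equations:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)          % forget gate
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)          % input gate
\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)   % candidate cell state
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)          % output gate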
When you add a neuron (unit), you are adding another 4 weights to the kernel, one for each of those matrices.
So for your input vector X you have four matrices, and therefore:
input_dim * (4 * units) = 1 * (4 * units) = kernel
For example, with look_back = 1 and units = 3 that gives the 1 x 12 kernel you observed.
Regarding the recurrent_kernel, you can find the answer here.
Basically, in Keras the input and the hidden state are not concatenated like in the example diagrams (W \cdot [h_{t-1}, x_t]); instead they are split and handled by another four matrices, called U.
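Roughly, the split form looks like this (the W/U symbols just continue the convention above; Keras stores the four W blocks side by side as the kernel and the four U blocks side by side as the recurrent kernel):

f_t = \sigma(x_t W_f + h_{t-1} U_f + b_f)
i_t = \sigma(x_t W_i + h_{t-1} U_i + b_i)
\tilde{C}_t = \tanh(x_t W_c + h_{t-1} U_c + b_c)
o_t = \sigma(x_t W_o + h_{t-1} U_o + b_o)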
Because each neuron receives the hidden state, which has one value per neuron, the weights U (all four U matrices together) amount to:
units * (4 * units) = recurrent kernel
h_{t-1} comes back in a recurrent way from all of your neurons. Like in a multilayer perceptron, the output of each neuron goes into every neuron of the next layer, which here is the recurrent layer itself.
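A minimal sketch to check these shapes and to slice the stored weights back into the four per-gate matrices (the variable names are mine; I believe Keras concatenates the gates in the order i, f, c, o, but check your version):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

look_back, units = 1, 3
model = Sequential()
model.add(LSTM(units=units, input_shape=(1, look_back)))

kernel, recurrent_kernel, bias = model.get_weights()
print(kernel.shape)            # (1, 12)  = input_dim x (4 * units)
print(recurrent_kernel.shape)  # (3, 12)  = units x (4 * units)
print(bias.shape)              # (12,)    = 4 * units

# slice into the four per-gate matrices
W_i, W_f, W_c, W_o = np.split(kernel, 4, axis=1)
U_i, U_f, U_c, U_o = np.split(recurrent_kernel, 4, axis=1)
print(W_f.shape, U_f.shape)    # (1, 3) (3, 3)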
source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/