I am new to this field, and I was reading the paper "Predicting citation counts based on deep neural network learning techniques". In it, the authors describe the model they implemented so that others can reproduce the results. I tried to do this, but I am not sure whether I succeeded.
Here is their description:
- RNN module - SimpleRNN
- Output dimension of the encoder - 512
- The output layer - Dense layer
- Activation function - ReLU
- Overfitting prevention technique - Dropout with 0.2 rate
- Epochs - 100
- Optimization algorithm - RMSProp
- Learning rate - 10^{-5}
- Batch size - 256
And here is my implementation. I am not sure whether the model I created is sequence-to-sequence.
epochs = 100
batch_size = 256
optimizer = keras.optimizers.RMSprop(learning_rate=1e-5)

model = keras.models.Sequential([
    keras.layers.SimpleRNN(512, input_shape=[X_train.shape[0], X_train.shape[1]],
                           activation='relu', return_sequences=True, dropout=0.2),
    keras.layers.Dense(9)
])
model.compile(loss='mse', optimizer=optimizer,
              metrics=[keras.metrics.RootMeanSquaredError()])
The summary of this model is:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
simple_rnn (SimpleRNN) (None, 154521, 512) 266240
_________________________________________________________________
dense (Dense) (None, 154521, 9) 4617
=================================================================
Total params: 270,857
Trainable params: 270,857
Non-trainable params: 0
_________________________________________________________________
Update: is this perhaps the correct way to formulate it?
encoder = keras.layers.SimpleRNN(512,
                                 input_shape=[X_train.shape[0], X_train.shape[1]],
                                 activation='relu',
                                 return_sequences=False,
                                 dropout=0.2)
decoder = keras.layers.SimpleRNN(512,
                                 input_shape=[X_train.shape[0], X_train.shape[1]],
                                 activation='relu',
                                 return_sequences=True,
                                 dropout=0.2)
output = keras.layers.Dense(9)(decoder)
This is the dataset that I am using.
year venue c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14
1989 234 0 1 2 3 4 5 5 5 5 8 8 10 11 12
1989 251 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1990 346 0 0 0 0 0 0 0 0 0 0 0 0 0 0
As input I need to give all the columns up to c5, and predict the remaining c columns (the citation counts for the upcoming years). Is this the right way to go forward?
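A minimal sketch of that input/target split, assuming the table is loaded as a NumPy array with the columns in the order shown (year, venue, c1..c14); the three rows are the sample rows above:

import numpy as np

data = np.array([
    [1989, 234, 0, 1, 2, 3, 4, 5, 5, 5, 5, 8, 8, 10, 11, 12],
    [1989, 251, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1990, 346, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
], dtype="float32")

X = data[:, :7]   # year, venue, c1..c5 -> model input
y = data[:, 7:]   # c6..c14, 9 values -> targets, matching Dense(9)

print(X.shape)  # (3, 7)
print(y.shape)  # (3, 9)

The 9 target columns are what makes Dense(9) the right output width here.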
Your model is a token classification model, not sequence-to-sequence.
A seq2seq model comprises an encoder and a decoder (both are RNNs in your case). It cannot be built with the Sequential API, because the encoder and the decoder have separate inputs.
The encoder should be created with return_sequences=False.
The Dense layer should follow the decoder.
It should look something like this:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input

encoder_input = Input(shape=(None, 512))
decoder_input = Input(shape=(None, 512))

encoder_output = keras.layers.SimpleRNN(512,
                                        activation='relu',
                                        return_sequences=False,
                                        dropout=0.2)(encoder_input)
# Add a time axis so the encoder summary can be prepended to the decoder input.
encoder_output = encoder_output[:, tf.newaxis, ...]
decoder_inputs = tf.concat([encoder_output, decoder_input], 1)

decoder_output = keras.layers.SimpleRNN(512,
                                        activation='relu',
                                        return_sequences=True,
                                        dropout=0.2)(decoder_inputs)
output = keras.layers.Dense(9)(decoder_output)

model_att = tf.keras.models.Model([encoder_input, decoder_input], output)
# Citation-count prediction is regression, so use MSE, not a classification loss.
model_att.compile(optimizer=keras.optimizers.Adam(), loss='mse')
model_att.summary()