
Layering softmax classifier into RNN autoencoder


The paper I'm implementing uses an RNN with an autoencoder to classify anomalous network data (binary classification). They first train the model unsupervised, and then they describe this process:

Next, fine-tuning training (supervised) is conducted to train the last layer of the network using labeled samples. Implementing the fine-tuning using supervised training criterion can further optimize the whole network. We use softmax regression layer with two channels at the top layer

Currently, I've implemented the autoencoder:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model

class AnomalyDetector(Model):
    def __init__(self):
        super(AnomalyDetector, self).__init__()
        # Encoder: compresses the 79-feature input down to 8 dimensions.
        self.encoder = tf.keras.Sequential([
            layers.Dense(64, activation="relu"),
            layers.Dense(32, activation="relu"),
            layers.Dense(16, activation="relu"),
            layers.Dense(8, activation="relu")])

        # Decoder: reconstructs the original 79 features.
        self.decoder = tf.keras.Sequential([
            layers.Dense(16, activation="relu"),
            layers.Dense(32, activation="relu"),
            layers.Dense(64, activation="relu"),
            layers.Dense(79, activation="relu")
        ])

    def call(self, x):
        return self.decoder(self.encoder(x))

How do you implement the softmax regression layer in TensorFlow?

I'm having trouble understanding the process. Am I supposed to add another layer to the autoencoder? Am I supposed to add another method to the class?


Solution

  • Just in case anyone visits this in the future: you can create the softmax layer simply by changing the activation of the top layer. I chose a sigmoid activation in my case, since a sigmoid is equivalent to a two-element softmax, as per the documentation.

    class AnomalyDetector(Model):
        def __init__(self):
            super(AnomalyDetector, self).__init__()
            # Bookkeeping flags for the two training phases.
            self.pretrained = False
            self.finished_training = False
            self.encoder = tf.keras.Sequential([
                layers.SimpleRNN(64, activation="relu", return_sequences=True),
                layers.SimpleRNN(32, activation="relu", return_sequences=True),
                layers.SimpleRNN(16, activation="relu", return_sequences=True),
                layers.SimpleRNN(8, activation="relu", return_sequences=True)])
    
            self.decoder = tf.keras.Sequential([
                layers.SimpleRNN(16, activation="relu", return_sequences=True),
                layers.SimpleRNN(32, activation="relu", return_sequences=True),
                layers.SimpleRNN(64, activation="relu", return_sequences=True),
                layers.SimpleRNN(79, activation="relu", return_sequences=True),
                # Final single-unit sigmoid layer: the binary classification
                # output, equivalent to a two-channel softmax.
                layers.SimpleRNN(1, activation="sigmoid")
            ])

        def call(self, x):
            return self.decoder(self.encoder(x))
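
  • If you would rather keep the explicit two-channel softmax regression layer the paper describes, one alternative is to pretrain the autoencoder on reconstruction alone and then fine-tune a small classification head on top of the pretrained encoder. This is only a rough sketch, not the paper's exact method; the data names x_unlabeled, x_labeled, and y_labels are placeholders:

    import tensorflow as tf
    from tensorflow.keras import layers

    # Phase 1 (unsupervised): pretrain the autoencoder on reconstruction.
    # This assumes a reconstruction-only decoder, i.e. without the final
    # sigmoid layer shown above.
    autoencoder = AnomalyDetector()
    autoencoder.compile(optimizer="adam", loss="mse")
    # autoencoder.fit(x_unlabeled, x_unlabeled, epochs=10)  # placeholder data

    # Phase 2 (supervised fine-tuning): a softmax regression layer with two
    # channels on top of the pretrained encoder, trained on labeled samples.
    classifier = tf.keras.Sequential([
        autoencoder.encoder,
        layers.SimpleRNN(8),                    # collapse the time dimension
        layers.Dense(2, activation="softmax"),  # two channels, one per class
    ])
    classifier.compile(optimizer="adam",
                       loss="sparse_categorical_crossentropy",
                       metrics=["accuracy"])
    # classifier.fit(x_labeled, y_labels, epochs=5)  # placeholder data

    # Setting autoencoder.encoder.trainable = False before phase 2 would
    # restrict the fine-tuning to the new top layers, which is closer to
    # the paper's "train the last layer" wording.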