tensorflow · keras · tensorflow-datasets

How to intercept and feed intra-layer output as target data


Sometimes we need to preprocess the data by feeding it through preprocessing layers. This becomes problematic when the model is an autoencoder, in which case the input serves as both the x and the y.

Correct me if I'm wrong, and perhaps there are other ways around this, but it seems obvious to me that if the true input is, say, [1, 2, 3], and I scale it to the range [0, 1], giving [0, 0.5, 1], then the model should evaluate the autoencoder against y = [0, 0.5, 1] rather than the raw [1, 2, 3]. So if my model is, for example:

[
  preprocess_1,
  preprocess_2,
  encode1,
  ...
  decode1
]

How do I go about feeding the output of [preprocess_1, preprocess_2] to my model as y?
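For concreteness, the scaling described above is ordinary min-max scaling; a quick sketch:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
# min-max scaling to [0, 1]: [1, 2, 3] -> [0, 0.5, 1]
scaled = (x - x.min()) / (x.max() - x.min())
```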


One obvious solution is to do the preprocessing before the model (for example, in TFT's preprocessing_fn), but that has its downsides. For one, I'm guessing that preprocessing outside the model means you won't get the full benefit of whatever GPU/accelerator the model has access to. Another is that the preprocessing layers come with better built-in support; doing this outside of them means you'd have to write some of the code yourself.
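If you do go the "preprocess before the model" route, a tf.data pipeline at least keeps the work in the input pipeline rather than in plain Python. A minimal sketch, where scale is a hypothetical stand-in for preprocess_1/preprocess_2, mapping each batch to a (scaled, scaled) pair so the autoencoder sees the preprocessed values as both input and target:

```python
import numpy as np
import tensorflow as tf

X = np.random.uniform(0, 255, (300, 10)).astype("float32")

def scale(x):
    # hypothetical stand-in for preprocess_1 / preprocess_2
    return x / 255.0

# map each batch to a (scaled, scaled) pair so the autoencoder
# receives the preprocessed values as both x and y
ds = (tf.data.Dataset.from_tensor_slices(X)
        .batch(32)
        .map(lambda x: (scale(x), scale(x))))
```

The resulting dataset can be passed directly to model.fit(ds).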


Solution

  • You simply have to modify your loss function so that it minimizes the difference between the predictions and the scaled inputs.

    This can be done using model.add_loss.

    Considering a dummy reconstruction task, where we have to reconstruct this data:

    import numpy as np

    X = np.random.uniform(0, 255, (300, 10))
    

    using this autoencoder:

    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Input, Rescaling
    from tensorflow.keras.models import Model

    inp = Input(shape=(10,))
    preprocess_inp = Rescaling(scale=1./255)(inp)  # PREPROCESSING LAYER
    x = Dense(64)(preprocess_inp)
    x = Dense(16)(x)
    x = Dense(64)(x)
    out = Dense(10)(x)
    model = Model(inputs=inp, outputs=out)


    You have to add the loss manually in this way, so that the preprocessing applied to the input data is taken into account:

    # compare reconstructions against the *scaled* inputs, reduced to a scalar
    model.add_loss(tf.reduce_mean(tf.keras.losses.mean_absolute_error(preprocess_inp, out)))
    model.compile(optimizer='adam', loss=None)


    Training then runs as usual:

    model.fit(X, X, epochs=10, batch_size=32)
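As an alternative to add_loss, a sub-model built over the preprocessing layers can materialize the scaled values up front, so they can be fed to fit as y exactly as the question asks, with an ordinary compiled loss. This is my own sketch (assuming TF 2.6+, where Rescaling lives in tf.keras.layers; the variable names are illustrative):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, Rescaling
from tensorflow.keras.models import Model

X = np.random.uniform(0, 255, (300, 10)).astype("float32")

inp = Input(shape=(10,))
scaled = Rescaling(scale=1. / 255)(inp)  # preprocessing layer
x = Dense(64)(scaled)
x = Dense(16)(x)
x = Dense(64)(x)
out = Dense(10)(x)
model = Model(inputs=inp, outputs=out)

# sub-model that exposes the output of the preprocessing layer,
# used here to materialize the scaled targets
preprocessor = Model(inputs=inp, outputs=scaled)
y_scaled = preprocessor.predict(X, verbose=0)

# with explicit targets available, an ordinary loss works
model.compile(optimizer='adam', loss='mae')
model.fit(X, y_scaled, epochs=2, batch_size=32, verbose=0)
```

This avoids the loss=None trick entirely, at the cost of one extra forward pass through the preprocessing layers to build the targets.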