python numpy tensorflow keras lstm

tensorflow-Keras LSTM VAE - Cannot convert a symbolic Tensor error on RHEL7 - Airflow


I am having the error

{taskinstance.py:1455} ERROR - Cannot convert a symbolic Tensor (lstm_4/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

Traceback (most recent call last)

when I create my LSTM-VAE model using the code below.

Configuration:

Python: 3.7.9
Tensorflow: 2.4.0
NumPy: 1.18.5

Oddly, the same code and configuration run fine on Windows (including Windows Server) but fail on RHEL7, where I am running the code under Airflow. Upgrading to NumPy 1.19.5 and TensorFlow 2.4.1 did not help.

# Encoder
input_x = tensorflow.keras.layers.Input(
    shape=(time_steps, number_of_features)
)
encoder_lstm_int = tensorflow.keras.layers.LSTM(
    int_dim, return_sequences=True
)(input_x)
encoder_lstm_latent = tensorflow.keras.layers.LSTM(
    latent_dim, return_sequences=False
)(encoder_lstm_int)

z_mean = tensorflow.keras.layers.Dense(latent_dim)(encoder_lstm_latent)
z_log_sigma = tensorflow.keras.layers.Dense(latent_dim)(
    encoder_lstm_latent
)
z_encoder_output = _Sampling()([z_mean, z_log_sigma])

encoder: tensorflow.keras.models.Model = tensorflow.keras.models.Model(
    input_x, [z_mean, z_log_sigma, z_encoder_output]
)

# Decoder
decoder_input = tensorflow.keras.layers.Input(shape=(latent_dim,))
decoder_repeated = tensorflow.keras.layers.RepeatVector(time_steps)(
    decoder_input
)
decoder_lstm_int = tensorflow.keras.layers.LSTM(
    int_dim, return_sequences=True
)(decoder_repeated)
decoder_lstm = tensorflow.keras.layers.LSTM(
    number_of_features, return_sequences=True
)(decoder_lstm_int)
decoder_dense1 = tensorflow.keras.layers.TimeDistributed(
    tensorflow.keras.layers.Dense(number_of_features * 2)
)(decoder_lstm)
decoder_output = tensorflow.keras.layers.TimeDistributed(
    tensorflow.keras.layers.Dense(number_of_features)
)(decoder_dense1)
decoder: tensorflow.keras.models.Model = tensorflow.keras.models.Model(
    decoder_input, decoder_output
)

# VAE
# Connect encoder and decoder: the decoder takes the encoder's third
# output (index 2, the sampled z) as its input.
output = decoder(encoder(input_x)[2])
lstm_vae: tensorflow.keras.models.Model = tensorflow.keras.models.Model(
    input_x, output, name='lstm_vae'
)

# Loss
rec_loss = (
    tensorflow.keras.backend.mean(
        tensorflow.keras.losses.mse(input_x, output)
    )
    * number_of_features
)
kl_loss = -0.5 * tensorflow.keras.backend.mean(
    1
    + z_log_sigma
    - tensorflow.keras.backend.square(z_mean)
    - tensorflow.keras.backend.exp(z_log_sigma)
)
vae_loss = rec_loss + kl_loss

lstm_vae.add_loss(vae_loss)
lstm_vae.compile(optimizer='adam', loss='mean_squared_error')

return encoder, decoder, lstm_vae

class _Sampling(tensorflow.keras.layers.Layer):
    """Sampling for encoder output."""

    @staticmethod
    def call(args):
        """
        Does sampling from the learned mu, std latent space for Decoder.
        """
        z_mean, z_log_sigma = args
        batch_size = tensorflow.shape(z_mean)[0]
        latent_dim = tensorflow.shape(z_mean)[1]
        epsilon = tensorflow.keras.backend.random_normal(
            shape=(batch_size, latent_dim), mean=0, stddev=1
        )
        return z_mean + tensorflow.keras.backend.exp(z_log_sigma / 2) * epsilon

Similar questions exist on Stack Overflow, but in those cases a NumPy array was used as part of the Tensor operations, and my model contains no NumPy arrays or NumPy operations. Another suggested fix was downgrading NumPy from 1.20 to 1.18, but 1.18.5 is already my version. So I am out of ideas.


Solution

  • Answering my own question: this occurred only because of NumPy 1.20. Even though I had downgraded to NumPy 1.18.5, I still got the error because Airflow was somehow still using a previous installation of NumPy (1.20), cached either in memory or in airflow/.local, despite pip listing 1.18.5. I removed NumPy from Airflow's .local environment and rebooted the machine, which resolved the issue.
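
For anyone hitting the same mismatch between what pip lists and what actually gets imported, a small check like the one below may help. It is only a sketch: the helper names `describe_module` and `loaded_from_user_site` are made up for illustration, and it assumes the stale copy lives in the per-user site-packages directory (typically under ~/.local on Linux), as it did in my case. Run it with the same interpreter and user that Airflow uses, for example from inside a task, since a shell session may resolve a different NumPy.

```python
import importlib
import site
import sys


def describe_module(name):
    """Return (version, file path) of the module this interpreter actually imports."""
    module = importlib.import_module(name)
    version = getattr(module, "__version__", None)
    path = getattr(module, "__file__", None)
    return version, path


def loaded_from_user_site(path):
    """True if path lies under the per-user site-packages dir (e.g. ~/.local/...)."""
    if path is None:
        return False
    return path.startswith(site.getusersitepackages())


if __name__ == "__main__":
    try:
        version, path = describe_module("numpy")
    except ImportError:
        print("numpy is not importable from this interpreter")
    else:
        print("numpy version:", version)
        print("loaded from:", path)
        print("from user site?", loaded_from_user_site(path))
    # The interpreter path shows which Python environment the task runs in.
    print("interpreter:", sys.executable)
```

If the printed version differs from the one pip lists, or the path points into the user site directory, the process is picking up a different installation than the one that was downgraded.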