Tags: python, numpy, tensorflow, deep-learning, tensorflow-estimator

How to feed further model parameters along the inputs to Tensorflow Estimator?


Short Version: I have a custom model with a layer which can be parameterized by a value sigma.

I am using this model with a TensorFlow Estimator:

    classifier = tf.estimator.Estimator(model_fn=model_fun)

For instance, training is set up as follows:

    # Train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": train_data},
        batch_size=batch_size, num_epochs=nEpochs, shuffle=True)
    classifier.train(input_fn=train_input_fn, max_steps=nSteps)

In this setup, how can I pass sigma to my custom Estimator so that it can be trained / evaluated with different values?


Longer Version:

I am working on a DNN autoencoder that has a layer which adds Gaussian noise, drawn with random_normal() from a distribution of known standard deviation sigma.

This is a model of a communication system. The output logits (from the final layer) are retrieved using the model.predict() function, and my metric, the bit error rate, is computed by a custom function. My setup is TensorFlow 1.5, Python 3.5, Windows 10.

The problem is as follows:

  1. I want to train the system for sigma=sigma1 and retrieve the output logits. (This part is fine, and I am able to get the desired output.)

  2. I also want to predict the outputs for sigma=sigma2, sigma=sigma3, sigma=sigma4 and so on, using the same Estimator, in the same program.

My DNN looks like this and is defined in the model function:

  1. Input Layer- One Hot Encoded Values are fed here.

  2. Dense+ReLU

  3. Dense+Linear

  4. Normalization Layer

  5. Addition of Noise: Here I add tf.random_normal(stddev=sigma) noise to the output of the previous layer. This is where I would appreciate help understanding how to use a different sigma for each run (train/test). In other words, sigma should be a parameter that can take a different value for each test run.

     gnoise = tf.random_normal(shape=tf.shape(norm), mean=0.0, stddev=sigma)
     output = norm + gnoise  # norm is the previous layer's output
    
  6. Dense+ReLU

  7. Softmax: the output is the logits
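
Step 5 can be illustrated without TensorFlow. The sketch below is a plain-Python stand-in, not the model code from the question: the helper name add_gaussian_noise and the sample values are made up for illustration. It only shows the arithmetic of the noise layer with sigma as an explicit argument, which is the value that needs to change between runs.

```python
import random


def add_gaussian_noise(values, sigma, rng):
    """Return values with zero-mean Gaussian noise of standard deviation sigma added."""
    return [v + rng.gauss(0.0, sigma) for v in values]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
norm = [0.6, -0.8, 0.0]  # stand-in for the normalization layer's output

unchanged = add_gaussian_noise(norm, 0.0, rng)  # sigma = 0 adds no noise
noisy = add_gaussian_noise(norm, 0.5, rng)      # larger sigma, larger perturbation
print(unchanged, noisy)
```

Because sigma is an argument rather than a constant inside the function, the same function can be reused for sigma1 during training and for other values at prediction time.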

I define my estimator as:

    classifier = tf.estimator.Estimator(model_fn=model_fun)

And the training is declared as follows:

    # Train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": train_data},
        batch_size=batch_size, num_epochs=nEpochs, shuffle=True)
    classifier.train(input_fn=train_input_fn, max_steps=nSteps)

The predict function is declared and called as:

    pred_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": test_data},
        batch_size=batch_size, num_epochs=nEpochs, shuffle=False)
    pred_results = classifier.predict(input_fn=pred_input_fn)

Solution

  • You should make sigma a parameter of your layer, then feed its value to your model at runtime through your features (distinguishing x from sigma by their feature-column keys).

    It is hard to answer precisely without your Gaussian layer code, but suppose your model is defined like this:

    import tensorflow as tf
    
    def gaussian_noise_layer(x):
        # Currently, your sigma is probably fixed somewhere here, e.g.
        sigma = 1
    
        dist = tf.distributions.Normal(loc=0., scale=sigma)
    
        # Build Gaussian kernel from dist:
        # gaussian_kernel = ...
    
        return tf.nn.depthwise_conv2d(x, gaussian_kernel, [1, 1, 1, 1], padding='SAME')
    
    def model_fun(features, labels, mode, params):
    
        # Get input x from features:
        x = tf.feature_column.input_layer(features, params.feature_column)
    
        # ... building your model here, adding at some point the gaussian layer, e.g.
        # ... net = f(x)
        net = gaussian_noise_layer(net)
        # ... predictions = f'(net)
        # ...
    
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions=predictions,
            loss=loss,
            train_op=train_op,
            eval_metric_ops=eval_metric_ops
        )
    
    with tf.Session() as sess:
        # ...
    
        # Specifying your params and inputs:
        params = tf.contrib.training.HParams(
            # ... other hyperparameters,
            # Define feature column for input x of shape "shape_x" (e.g. (64, 64, 3)):
            feature_column=tf.feature_column.numeric_column(key="x", shape=shape_x)
        )
    
        classifier = tf.estimator.Estimator(model_fn=model_fun, params=params)
    
        # For training:
        train_input_fn = tf.estimator.inputs.numpy_input_fn(x={"x": train_data},
                                                            batch_size=batch_size, num_epochs=nEpochs, shuffle=True)
        classifier.train(input_fn=train_input_fn, max_steps=nSteps)
    

    ... then you would need to edit it like this:

    import numpy
    import tensorflow as tf
    
    def gaussian_noise_layer(x, sigma):
    
        # sigma is now a parameter
        dist = tf.distributions.Normal(loc=0., scale=sigma)
    
        # Build Gaussian kernel from dist:
        # gaussian_kernel = ...
    
        return tf.nn.depthwise_conv2d(x, gaussian_kernel, [1, 1, 1, 1], padding='SAME')
    
    def model_fun(features, labels, mode, params):
    
        # Get input x from features:
        x = tf.feature_column.input_layer(features, params.input_feature_column)
        # Get sigma from features:
        sigma = tf.feature_column.input_layer(features, params.sigma_feature_column)
    
        # ... building your model here, adding at some point the gaussian layer, e.g.
        # ... net = f(x)
        net = gaussian_noise_layer(net, sigma)
        # ... predictions = f'(net)
        # ...
    
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions=predictions,
            loss=loss,
            train_op=train_op,
            eval_metric_ops=eval_metric_ops
        )
    
    with tf.Session() as sess:
        # ...
    
        # We now specify which columns contain the actual inputs (x), and which columns contain other parameters (sigma):
        params = tf.contrib.training.HParams(
            # ... other hyperparameters,
            # Define feature column for input x of shape "shape_x" (e.g. (64, 64, 3)):
            input_feature_column=tf.feature_column.numeric_column(key="x", shape=shape_x),
            # Define feature column for input sigma of shape () i.e. scalar (default shape):
            sigma_feature_column=tf.feature_column.numeric_column(key="sigma")
        )
    
        classifier = tf.estimator.Estimator(model_fn=model_fun, params=params)
    
        # Train:
        num_train_elements = train_data.shape[0]
        sigma = [1] * num_train_elements  # or e.g. sigma = [1, 1, 2, 1, 3, ...]
        # We can now feed sigma along x:
        train_input_fn = tf.estimator.inputs.numpy_input_fn(
             x={"x": train_data, "sigma": numpy.array(sigma)},
             batch_size=batch_size, num_epochs=nEpochs, shuffle=True)
    
        classifier.train(input_fn=train_input_fn, max_steps=nSteps)
    
        # ...
    
        # Predict on the test data, here with a different sigma than during training:
        num_test_elements = test_data.shape[0]
        sigma = [2] * num_test_elements  # or e.g. sigma = [1, 1, 2, 1, 3, ...]
        pred_input_fn = tf.estimator.inputs.numpy_input_fn(
            x={"x": test_data, "sigma": numpy.array(sigma)},
            batch_size=batch_size, num_epochs=1, shuffle=False)
        pred_results = classifier.predict(input_fn=pred_input_fn)
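
To predict for several noise levels in the same program, as the question asks, one option is to build one features dict per sigma value and call predict once for each. The sketch below shows only the feature-building part in plain Python so that it runs on its own. The helper name sigma_feature_sets and the sample inputs are hypothetical, and the sigma values are placeholders.

```python
def sigma_feature_sets(inputs, sigmas):
    """Yield (sigma, features) pairs: one features dict per noise level.

    Each dict mirrors the {"x": ..., "sigma": ...} layout used above, with the
    scalar sigma repeated once per example.
    """
    for sigma in sigmas:
        yield sigma, {"x": inputs, "sigma": [float(sigma)] * len(inputs)}


# Stand-in for test_data: three one-hot encoded examples.
test_inputs = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

for sigma, features in sigma_feature_sets(test_inputs, [0.5, 1.0, 2.0]):
    # With a trained Estimator, the values of each dict would be converted with
    # numpy.array(...) and passed to numpy_input_fn(..., num_epochs=1, shuffle=False)
    # before calling classifier.predict.
    print(sigma, features["sigma"])
```

Two details are worth keeping in mind when wiring this into the model. For a scalar numeric column, tf.feature_column.input_layer returns a tensor of shape [batch_size, 1], so inside gaussian_noise_layer sigma is one value per example and has to broadcast against the layer's output. Also, classifier.predict returns a generator, so wrapping it in list(...) is a simple way to collect the results for each sigma before moving on to the next.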