Building a non-linear model with ReLUs in TensorFlow

I'm trying to build a simple non-linear model in TensorFlow. I have created this sample data:

x_data = np.arange(-100, 100).astype(np.float32)
y_data = np.abs(x_data + 20.) 

I guess this shape should be easily reconstructed using a couple of ReLUs, but I can't figure out how.

So far, my approach is to wrap linear components with ReLUs, but this doesn't run:

W1 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
W2 = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b1 = tf.Variable(tf.zeros([1]))
b2 = tf.Variable(tf.zeros([1]))

y = tf.nn.relu(W1 * x_data + b1) + tf.nn.relu(W2 * x_data + b2)

Any ideas about how to express this model using ReLUs in TensorFlow?


  • I think you're asking how to combine ReLUs in a working model? Two options are shown below:

    Option 1) Input of ReLU1 into ReLU2

    This is probably the preferred method. Note that r1 is the input to r2.

    x = tf.placeholder('float', shape=[None, 1])
    y_ = tf.placeholder('float', shape=[None, 1])
    W1 = weight_variable([1, hidden_units])
    b1 = bias_variable([hidden_units])
    r1 = tf.nn.relu(tf.matmul(x, W1) + b1)
    # Input of r1 into r2 (which is just y)
    W2 = weight_variable([hidden_units, 1])
    b2 = bias_variable([1])
    y = tf.nn.relu(tf.matmul(r1,W2)+b2) # ReLU2
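To see why this stacked layout can fit the target at all, here is a minimal framework-free sketch of the same forward pass with two hidden units. The weight values and the helper name `option1_forward` are assumptions chosen by hand for illustration; they are not what training would necessarily find.

```python
# Framework-free sketch of Option 1 with two hidden units.
# All weight values are hand-picked assumptions, not trained parameters.

def relu(v):
    return max(0.0, v)

def option1_forward(x, W1, b1, W2, b2):
    """Hidden ReLU layer (ReLU1), then a ReLU on the weighted sum (ReLU2)."""
    r1 = [relu(w * x + b) for w, b in zip(W1, b1)]
    return relu(sum(h * w for h, w in zip(r1, W2)) + b2)

# Hidden unit 0 is active for x > -20, hidden unit 1 for x < -20.
W1, b1 = [1.0, -1.0], [20.0, -20.0]
W2, b2 = [1.0, 1.0], 0.0

xs = [float(v) for v in range(-100, 100)]
predictions = [option1_forward(x, W1, b1, W2, b2) for x in xs]
targets = [abs(x + 20.0) for x in xs]

# Largest deviation from |x + 20| over the sample data
max_abs_error = max(abs(p - t) for p, t in zip(predictions, targets))
print(max_abs_error)
```

With these weights the hidden layer computes relu(x + 20) and relu(-(x + 20)), and the output unit adds them. Whether gradient descent reaches such a solution from a random start is a separate question.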

    Option 2) Add ReLU1 and ReLU2

    Option 2 was listed in the original question, but I don't know if this is what you really want. See below for a full working example and try it. I think you'll find it doesn't model well.

    x = tf.placeholder('float', shape=[None, 1])
    y_ = tf.placeholder('float', shape=[None, 1])
    W1 = weight_variable([1, hidden_units])
    b1 = bias_variable([hidden_units])
    r1 = tf.nn.relu(tf.matmul(x, W1) + b1)
    # Add r1 to r2 -- won't be able to reduce the error.
    W2 = weight_variable([1, hidden_units])
    b2 = bias_variable([hidden_units])
    r2 = tf.nn.relu(tf.matmul(x, W2) + b2)
    y = tf.add(r1, r2)  # y is the element-wise sum of the two ReLU outputs
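One thing worth checking in Option 2 is the output shape: r1 + r2 has one column per hidden unit, while y_ has a single column. The sketch below mirrors the computation in plain Python to make that visible. The helper `option2_forward` and all weight values are illustrative assumptions, and this may or may not be the main reason the model trains poorly.

```python
# Framework-free sketch of Option 2: two ReLU layers on the same input, added element-wise.
# Weight values are hand-picked assumptions for illustration, not trained parameters.

def relu(v):
    return max(0.0, v)

def option2_forward(x, W1, b1, W2, b2):
    """Return r1 + r2 element-wise; the result has one entry per hidden unit."""
    r1 = [relu(w * x + b) for w, b in zip(W1, b1)]
    r2 = [relu(w * x + b) for w, b in zip(W2, b2)]
    return [a + c for a, c in zip(r1, r2)]

# With a single unit per layer, the sum can equal |x + 20| exactly.
xs = [float(v) for v in range(-100, 100)]
exact = all(
    option2_forward(x, [1.0], [20.0], [-1.0], [-20.0]) == [abs(x + 20.0)]
    for x in xs
)
print(exact)

# With hidden_units = 10, the output has 10 entries per input rather than 1.
hidden_units = 10
wide = option2_forward(3.0, [0.1] * hidden_units, [0.1] * hidden_units,
                       [0.1] * hidden_units, [0.1] * hidden_units)
print(len(wide))
```

So a sum of two ReLUs can represent the target in principle when each layer has one unit. With more hidden units, every column of the output is compared against the same target value.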

    Full Working Example

    Below is a full working example. By default it uses option 1; option 2 is included as commented-out code.

     from __future__ import print_function
     import tensorflow as tf
     import numpy as np
     import matplotlib.pyplot as plt
     # Configure the matplotlib backend to plot inline (IPython/Jupyter only; remove in a plain script)
     %matplotlib inline
     episodes = 55
     batch_size = 5
     hidden_units = 10
     learning_rate = 1e-3
     def weight_variable(shape):
         initial = tf.truncated_normal(shape, stddev=0.1)
         return tf.Variable(initial)
     def bias_variable(shape):
         initial = tf.constant(0.1, shape=shape)
         return tf.Variable(initial)
     # Produce the data
     x_data = np.arange(-100, 100).astype(np.float32)
     y_data = np.abs(x_data + 20.)
     # Plot it.
     plt.plot(x_data, y_data)
     # Might want to randomize the data
     # np.random.shuffle(x_data)
     # y_data = np.abs(x_data + 20.)
     # reshape data ...
     x_data = x_data.reshape(200, 1)
     y_data = y_data.reshape(200, 1)
     # create placeholders to pass the data to the model
     x = tf.placeholder('float', shape=[None, 1])
     y_ = tf.placeholder('float', shape=[None, 1])
     W1 = weight_variable([1, hidden_units])
     b1 = bias_variable([hidden_units])
     r1 = tf.nn.relu(tf.matmul(x, W1) + b1)
     # Input of r1 into r2 (which is just y)
     W2 = weight_variable([hidden_units, 1])
     b2 = bias_variable([1])
     y = tf.nn.relu(tf.matmul(r1,W2)+b2) 
     # OPTION 2 
     # Add r1 to r2 -- won't be able to reduce the error.
     #W2 = weight_variable([1, hidden_units])
     #b2 = bias_variable([hidden_units])
     #r2 = tf.nn.relu(tf.matmul(x, W2) + b2)
     #y = tf.add(r1,r2)
     # Despite the name, this is the summed (not averaged) squared error over the batch
     mean_square_error = tf.reduce_sum(tf.square(y-y_))
     training = tf.train.AdamOptimizer(learning_rate).minimize(mean_square_error)
     sess = tf.InteractiveSession()
     # Variables must be initialized before the first sess.run call
     sess.run(tf.global_variables_initializer())
     min_error = np.inf
     for _ in range(episodes):
         # iterate through the data in overlapping windows of batch_size rows
         for i in range(x_data.shape[0]-batch_size+1):
             _, error = sess.run([training, mean_square_error], feed_dict={x: x_data[i:i+batch_size], y_: y_data[i:i+batch_size]})
             if error < min_error:
                 min_error = error
                 if min_error < 3:
                     print(error)
             #print(error, x_data[i:i+batch_size], y_data[i:i+batch_size])
     # error = sess.run([training, mean_square_error], feed_dict={x: x_data[i:i+batch_size], y_: y_data[i:i+batch_size]})
     # if error != None:
     #    print(error)
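The inner loop above walks the data in overlapping windows rather than disjoint batches. A small sketch of just the indexing, with a hypothetical helper `sliding_windows` that is not part of the example, shows how many steps one episode takes for 200 rows and a batch size of 5:

```python
def sliding_windows(n_rows, batch_size):
    """Yield (start, stop) index pairs for overlapping windows with stride 1."""
    for i in range(n_rows - batch_size + 1):
        yield i, i + batch_size

windows = list(sliding_windows(200, 5))
print(len(windows))              # training steps per episode
print(windows[0], windows[-1])   # first and last window
```

Because the windows overlap, most rows are visited several times per episode, and since the data is sorted, each batch covers only a narrow range of x unless the data is shuffled first.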

    It might be easier to see in a Jupyter notebook here