Here is the guide for making a custom estimator in TensorFlow: https://www.tensorflow.org/guide/custom_estimators
The hidden layers are built using tf.nn.relu:

    # Build the hidden layers, sized according to the 'hidden_units' param.
    for units in params['hidden_units']:
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
I altered the example a bit to learn XOR, with hidden_units=[4] and n_classes=2. When the activation function is changed to tf.nn.sigmoid, the example still works. Why is that? Is it only giving the correct result because the XOR inputs are just zeros and ones?
Both activations give smooth loss curves that converge to zero.
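For reproducibility, here is a minimal pure-NumPy sketch of the same experiment (the initialization, learning rate, and step count are my assumptions, not the estimator's internals): a single hidden layer of 4 sigmoid units trained on XOR by gradient descent.

```python
import numpy as np

# Hypothetical stand-in for the estimator experiment: a 2-4-1 network
# (matching hidden_units=[4]) with sigmoid activations, trained on XOR.
# Learning rate and step count are assumptions made for this sketch.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)

def forward():
    h = sigmoid(X @ W1 + b1)   # hidden layer
    p = sigmoid(h @ W2 + b2)   # output probability
    return h, p

_, p0 = forward()
initial_loss = np.mean((p0 - y) ** 2)

lr = 0.5
for _ in range(10000):
    h, p = forward()
    # Backprop with cross-entropy loss: the output delta simplifies to p - y.
    d_out = p - y
    d_hid = (d_out @ W2.T) * h * (1 - h)   # sigmoid derivative factor
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(0)
    W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(0)

_, p = forward()
final_loss = np.mean((p - y) ** 2)
print("initial MSE:", initial_loss, "final MSE:", final_loss)
print("predictions:", (p > 0.5).astype(int).ravel().tolist())
```

With one hidden layer the sigmoid network has no trouble learning XOR, which matches what the estimator shows.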
Regarding the XOR problem: ReLU was adopted to address the vanishing-gradient problem, in which the error signal from backpropagation vanishes as it passes through deep hidden layers. So sigmoid works fine as long as you use just one hidden layer.

Sigmoid outputs values in the range 0 to 1, and by the chain rule each layer multiplies the backpropagated error by the sigmoid's derivative. As a result, the error becomes very small in layers far from the output.
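To make the shrinking concrete, here is a quick numerical sketch: the sigmoid's derivative never exceeds 0.25 (its value at z = 0), and backprop multiplies one such factor per layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid(z):
    # Derivative of sigmoid: s(z) * (1 - s(z)), maximized at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)

peak = dsigmoid(0.0)
print("max sigmoid derivative:", peak)        # 0.25

# Best case through 10 sigmoid layers: the error is scaled by 0.25 per layer.
print("scale after 10 layers:", peak ** 10)   # under 1e-6
```

Even in the best case, ten sigmoid layers shrink the error by a factor of roughly a million, so early layers barely learn.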
(In the plot: the blue line is ReLU, the yellow line is sigmoid.)
ReLU is the identity for inputs greater than 0, so its derivative there is 1 and the error signal can reach the first layer without shrinking.