Tags: python, tensorflow, neural-network, activation-function, data-layers

Confusion about neural network activation functions


I followed a tutorial about an image classifier using Python and TensorFlow.

I'm now trying to apply deep learning to a custom problem. I wrote a simulation of sellers and buyers in which each customer buys a stone according to their wishes. The stones have a color, a size and a percentage of curvature. The closer a stone is to the customer's wished values, the more the customer is willing to pay. For the seller, the rarer the stone, the higher the price should be. The program then generates 100,000 stone purchases to feed a neural network that will try to beat the other sellers. The dataset looks like this:

[image: dataset]

I'm now trying to create my neural network. The tutorial uses two Conv2D layers with a relu activation function and a MaxPooling2D layer, then a Flatten layer, a Dense layer and finally another Dense layer with a sigmoid activation function.

After reading some documentation, I found that Conv2D layers are meant for matrix-shaped (2D) input, but my data is already flat, so I would rather use only Dense layers.

My first question is: does my neural network need a Dense layer with a relu function like this:

model.add(Dense(64, activation='relu', input_dim=3))

if my program generates only positive values?

My second question is: does my neural network need a sigmoid function if I have already normalized my data to lie between 0 and 1 by dividing it like this?

X[:,0] /= 256.0
X[:,1] /= 50.0
X[:,2] /= 100.0

These divisors are the maximum values of each column. So do I need a sigmoid function?

Currently my neural network looks like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=3))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

But I'm unsure about the effectiveness of my model. Could my neural network work? If not, what kind of layers and activation functions should I use?


Solution

  • My first question is: does my neural network need a Dense layer with a relu function like this:

    Yes. Your network requires the ReLUs even if your data is only positive. The idea of ReLUs (and activation functions in general) is that they add nonlinearity, so that the network can learn relationships more complex than a linear map of its inputs.

    Consider a CNN that takes images as inputs. The input data there also consists only of positive values ([0, 1] or [0, 255]), and such networks usually have many layers with the ReLU nonlinearity.

    If my program generates only positive values?

    The confusion here is that, although your actual input-output relationship only involves positive values, your network still contains weights that can be negative, so without ReLU your layer outputs could still be negative.

    Also, if you were not to have any nonlinearities like ReLU, there would be no point in having multiple layers: a stack of purely linear layers is equivalent to a single linear layer, so the extra layers would add no expressive power.

    My second question is: does my neural network need a sigmoid function if I have already normalized my data to lie between 0 and 1 by dividing it like this?

    Yes. You also need the sigmoid. Same reasoning as above. Your data may be positive, but your output layer would still be able to produce negative values or otherwise values outside your expected range.

    A linear output activation would leave the output unbounded, which tends to make learning considerably harder when your target range is [0, 1].
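
The arguments above can be checked numerically. The following is a minimal sketch in plain Python (no TensorFlow), assuming a tiny 3 -> 2 -> 1 network whose weights (`W1`, `b1`, `W2`, `b2`) and input `x` are hand-picked, hypothetical values rather than anything trained on the stone dataset. It illustrates three things: positive inputs can still produce negative pre-activations, two Dense layers without an activation collapse into one linear layer, and a sigmoid keeps the final output inside (0, 1).

```python
import math

def dense(x, weights, bias):
    """Dense layer without activation; `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Hypothetical weights chosen by hand for illustration (not trained values).
W1 = [[0.5, -1.0, 0.25], [-0.75, 0.5, 1.0]]  # 3 inputs -> 2 hidden units
b1 = [0.1, -0.2]
W2 = [[2.0, -3.0]]                            # 2 hidden units -> 1 output
b2 = [0.5]

x = [0.9, 0.2, 0.4]  # normalized, strictly positive features

# 1) Positive inputs do not imply positive layer outputs: weights can be negative.
hidden_pre = dense(x, W1, b1)
print("hidden pre-activations:", hidden_pre)

# 2) Without a nonlinearity, two layers equal one layer with merged weights:
#    W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
stacked_linear = dense(hidden_pre, W2, b2)[0]
W_merged = [[sum(W2[0][k] * W1[k][j] for k in range(2)) for j in range(3)]]
b_merged = [sum(W2[0][k] * b1[k] for k in range(2)) + b2[0]]
single_linear = dense(x, W_merged, b_merged)[0]
print("stacked linear:", stacked_linear, "single linear:", single_linear)

# 3) With ReLU in between, the network is no longer that single linear map.
with_relu = dense(relu(hidden_pre), W2, b2)[0]
print("with ReLU:", with_relu)

# 4) The raw output is unbounded; the sigmoid squashes it into (0, 1).
bounded = sigmoid(with_relu)
print("after sigmoid:", bounded)
```

With these particular numbers the second hidden unit goes negative before the ReLU, and the raw output lands outside [0, 1] until the sigmoid is applied, which is the situation the answer describes.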