
How to fix NaN or Inf when we use cross entropy with Theano?


We'll have to compute:

y*log(y_compute) + (1-y)*log(1-y_compute)

so when y_compute becomes exactly 1.0 or 0.0, one of the logarithms evaluates to log(0) and the result is NaN or Inf. What should I do to avoid this?
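
A minimal sketch that reproduces the problem (the variable names and test values here are made up for illustration):

    import numpy as np
    import theano
    import theano.tensor as T

    y = T.vector('y')
    y_compute = T.vector('y_compute')
    xent = -(y * T.log(y_compute) + (1 - y) * T.log(1 - y_compute))
    f = theano.function([y, y_compute], xent)

    # y_compute hitting exactly 0 breaks both terms:
    # log(0) gives -inf, and 0 * log(0) gives nan
    print(f(np.array([1., 0.]), np.array([0., 0.])))  # -> [ inf  nan]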


Solution

  • Your expression for y_compute probably contains an exponential, e.g. because it comes from theano.tensor.nnet.sigmoid. In that case it should usually never reach exactly 0 or 1, and you can just use your expression, or theano.tensor.nnet.crossentropy_categorical_1hot, directly (see the first sketch after this list).

    If for whatever reason you do get exact 0s and 1s, another way is to clip the input to the crossentropy. Try e.g. replacing y_compute with theano.tensor.clip(y_compute, 0.001, 0.999), keeping in mind that this restricts the range of the logarithm (see the second sketch below).
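
For the first point, a minimal sketch of a logistic-regression-style graph (the names x, w, b and the random inputs are hypothetical, just for illustration). Because sigmoid maps into the open interval (0, 1), the logarithms stay finite:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.matrix('x')
    y = T.vector('y')
    w = theano.shared(np.zeros(3), name='w')  # hypothetical weights
    b = theano.shared(0., name='b')           # hypothetical bias

    # sigmoid output lives strictly inside (0, 1), so
    # log(y_compute) and log(1 - y_compute) stay finite
    y_compute = T.nnet.sigmoid(T.dot(x, w) + b)
    xent = -(y * T.log(y_compute) + (1 - y) * T.log(1 - y_compute))
    f = theano.function([x, y], xent.mean())

    print(f(np.random.randn(4, 3), np.array([0., 1., 0., 1.])))

Note that in float32 very large logits can still make the sigmoid round to exactly 0 or 1, which is why this usually, but not always, suffices.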
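
For the second point, a sketch of the clipping workaround, using the bounds 0.001 and 0.999 suggested above:

    import numpy as np
    import theano
    import theano.tensor as T

    y = T.vector('y')
    y_compute = T.vector('y_compute')

    # clip predictions away from exact 0 and 1 before taking logs;
    # the price is that each term is now bounded by -log(0.001)
    y_clipped = T.clip(y_compute, 0.001, 0.999)
    xent = -(y * T.log(y_clipped) + (1 - y) * T.log(1 - y_clipped))
    f = theano.function([y, y_compute], xent)

    # the inputs that produced inf/nan before are now finite
    print(f(np.array([1., 0.]), np.array([0., 0.])))  # -> approx. [6.908, 0.001]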