Tags: python, tensorflow, generative-adversarial-network

Why do I get NaN loss values when training the discriminator and generator of a GAN?


I have saved my text vectors using the gensim library; they contain some negative numbers. Will this affect training? If not, why am I getting a NaN loss value, first for the discriminator and then for both the discriminator and the generator, after a certain number of training steps?


Solution

  • There are several reasons for a NaN loss and for a model diverging. The most common ones I've seen are:

    • Your learning rate is too high. If this is the case, the loss increases and then diverges to infinity.
    • You have a division by zero (or a log of zero) somewhere. If this is the case, add a small constant such as 1e-8 to your output probability before the division or log.
    • You have bad inputs. If this is the case, make sure you do not feed your model NaNs, e.g. check the input data with assert not np.any(np.isnan(x)).
    • Your labels are not in the same domain as your objective function. If this is the case, check the range of your labels and make sure they match.
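    The epsilon and bad-input checks above can be sketched in plain NumPy. The function names (`check_inputs`, `stable_bce`) are illustrative, not part of any library; the point is that negative vector components (as gensim embeddings routinely produce) are harmless, while an exact 0 or 1 probability fed into a log is what produces the NaN:

    ```python
    import numpy as np

    def check_inputs(x):
        # Fail fast on bad inputs, as suggested above.
        assert not np.any(np.isnan(x)), "input contains NaN"

    def stable_bce(p, y, eps=1e-8):
        # Binary cross-entropy with a small epsilon so log(0) never occurs.
        p = np.clip(p, eps, 1.0 - eps)
        return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Negative components are fine; NaNs are not.
    x = np.array([[0.3, -1.2], [-0.5, 0.8]])
    check_inputs(x)

    p = np.array([1.0, 0.0, 0.5])  # raw sigmoid outputs hitting exactly 0 and 1
    y = np.array([1.0, 0.0, 1.0])
    print(stable_bce(p, y))        # finite; without the clip it would be NaN
    ```

    Without the `np.clip`, the `p = 0.0` entry makes `np.log(p)` return `-inf` and `0 * -inf` return NaN, which then propagates through the mean, the gradients, and eventually the weights of both networks.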

    If none of the above helps, check the activation function, the optimizer, the loss function, and the size and shape of the network.

    Finally, though it is less likely, there might be a bug in the framework you are using. Check the framework's repository to see whether others are reporting the same issue.
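    The learning-rate failure mode in the first bullet is easy to reproduce on a toy problem. This is not GAN code, just plain gradient descent on f(x) = x², but it shows the exact symptom described in the question: the loss first overflows to infinity, then an inf − inf in the update turns it into NaN:

    ```python
    import math

    def loss_after_training(lr, steps=1200, x0=1.0):
        # Minimise f(x) = x**2 by gradient descent; the gradient is 2*x.
        x = x0
        for _ in range(steps):
            x -= lr * 2 * x
        return x * x  # final loss value

    print(loss_after_training(0.1))  # sensible lr: loss shrinks toward 0
    print(loss_after_training(1.5))  # too-large lr: overflows, ends as NaN
    ```

    With lr = 1.5 each step multiplies x by −2, so x overflows to infinity after roughly a thousand steps, and the next update computes inf − inf, which is NaN; from then on every loss value is NaN, exactly the pattern of "training works for a while, then everything becomes NaN".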