Each element of my dataset has a multi-label tensor like [1, 0, 0, 1], with varying combinations of 1's and 0's. Since each target tensor has 4 entries, I set the output layer of my neural network to 4 neurons. Using BCEWithLogitsLoss, I obtain an output tensor like [3, 2, 0, 0] when I call model(inputs), with values in the range (0, 3), since I specified 4 output neurons. This does not match the format the target is expected to have, but when I change the number of output neurons to 2, I get a shape mismatch error. What needs to be done to fix this?
When using BCEWithLogitsLoss, you make one scalar prediction per binary label.
In your example, you have 4 binary labels to predict, so your model outputs a 4-dimensional vector in which each entry is the prediction for one of the binary labels.
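As a minimal sketch of the shapes involved: the layer sizes, the batch size, and the single nn.Linear standing in for the model are assumptions made for illustration, not details from the question. The point is that the logits and the float targets have the same shape, with one column per binary label.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: 8 input features, 4 binary labels, batch of 3.
num_features, num_labels, batch_size = 8, 4, 3

# One output neuron per binary label; the model itself applies no sigmoid.
model = nn.Linear(num_features, num_labels)
criterion = nn.BCEWithLogitsLoss()

inputs = torch.randn(batch_size, num_features)

# Targets must be floats with the same shape as the logits.
targets = torch.tensor([[1, 0, 0, 1],
                        [0, 1, 0, 0],
                        [1, 1, 1, 0]], dtype=torch.float32)

logits = model(inputs)              # shape: (batch_size, num_labels)
loss = criterion(logits, targets)   # scalar, averaged over all entries

print(logits.shape, targets.shape, loss.item())
```

Reducing the output layer to 2 neurons would give logits of shape (batch_size, 2) against targets of shape (batch_size, 4), which is where the shape mismatch error comes from.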
Using BCEWithLogitsLoss, you implicitly apply a sigmoid to your outputs. From the documentation:
"This loss combines a Sigmoid layer and the BCELoss in one single class."
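The equivalence stated above can be checked numerically. This is only a sketch: the logits and targets are made-up values, and the comparison uses a small tolerance because the two code paths differ slightly in floating-point behavior.

```python
import torch
import torch.nn as nn

# Hypothetical raw model outputs (logits) and float targets for one sample.
logits = torch.tensor([[3.0, 2.0, 0.0, 0.0]])
targets = torch.tensor([[1.0, 0.0, 0.0, 1.0]])

# Single step: sigmoid and binary cross-entropy fused in one loss.
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Two steps: explicit sigmoid followed by plain BCELoss.
loss_two_step = nn.BCELoss()(torch.sigmoid(logits), targets)

same = torch.allclose(loss_with_logits, loss_two_step, atol=1e-6)
print(same)
```

The fused version is generally preferred for training because it is more numerically stable for large-magnitude logits.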
Therefore, if you want the predicted probabilities from your model, you need to apply torch.sigmoid to its output. The sigmoid function converts your predicted logits to probabilities.
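A short sketch of that conversion, using made-up logits. The 0.5 threshold used to turn probabilities back into 0/1 labels is an assumption (a common default), not something fixed by the loss.

```python
import torch

# Hypothetical logits for one sample with 4 binary labels.
logits = torch.tensor([3.0, 2.0, -1.0, -0.5])

# Sigmoid maps each logit independently to a probability in (0, 1).
probs = torch.sigmoid(logits)

# Assumed decision rule: predict 1 where the probability exceeds 0.5.
preds = (probs > 0.5).int()

print(probs)
print(preds)  # tensor([1, 1, 0, 0], dtype=torch.int32)
```

After thresholding, the predictions have the same 0/1 format as the targets, so they can be compared entry by entry.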