BCEWithLogitsLoss: Trying to get binary output for predicted label as a tensor, confused with output layer

Each element of my dataset has a multi-label target tensor like [1, 0, 0, 1], with varying combinations of 1's and 0's. Since each target has 4 entries, I set the output layer of my neural network to 4 neurons. With BCEWithLogitsLoss, calling model(inputs) gives me an output tensor like [3, 2, 0, 0], with values in the range 0 to 3 because I specified 4 output neurons. This does not match the format of the target, but when I change the number of output neurons to 2, I get a shape mismatch error. What needs to be done to fix this?

Solution

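When using BCEWithLogitsLoss, your model makes one scalar prediction (a logit) per binary label.
In your example, you have 4 binary labels to predict, so your model should output a 4-dimensional vector in which each entry is the prediction for one of the binary labels. These entries are unbounded real-valued logits, not class indices, so 4 output neurons is the right choice.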
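
As a minimal sketch of that setup: the layer sizes, the single nn.Linear standing in for your network, and the random inputs below are all assumptions for illustration, not your actual model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: 10 input features, 4 binary labels per sample.
num_features, num_labels = 10, 4
model = nn.Linear(num_features, num_labels)  # stand-in for the real network

inputs = torch.randn(3, num_features)  # batch of 3 samples
# Multi-hot targets must be float and have the same shape as the logits.
targets = torch.tensor([[1., 0., 0., 1.],
                        [0., 1., 0., 0.],
                        [1., 1., 1., 0.]])

logits = model(inputs)  # shape (3, 4), raw real-valued scores
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)  # scalar, averaged over all 12 entries
```

Note that the targets are float tensors; integer targets would raise a dtype error in the loss.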

BCEWithLogitsLoss applies a sigmoid to your outputs internally, inside the loss computation. As the PyTorch documentation puts it:

"This loss combines a Sigmoid layer and the BCELoss in one single class."
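
A quick way to check this is to compare the two formulations on the same tensors. The logit and target values here are made up for illustration.

```python
import torch
import torch.nn as nn

# Made-up logits and targets for one sample with 4 binary labels.
logits = torch.tensor([[2.0, -1.0, 0.5, -3.0]])
targets = torch.tensor([[1.0, 0.0, 0.0, 1.0]])

# Sigmoid is applied inside the loss.
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Sigmoid applied by hand, followed by plain BCELoss.
loss_manual = nn.BCELoss()(torch.sigmoid(logits), targets)

print(torch.allclose(loss_with_logits, loss_manual, atol=1e-6))
```

The two values should agree up to floating-point error; the combined version is generally preferred because it is more numerically stable.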

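Therefore, the model itself still returns raw logits. If you want the predicted probabilities, apply torch.sigmoid to the model's output; the sigmoid function converts the logits to probabilities in (0, 1). To get binary labels in the same format as your targets, threshold those probabilities, for example at 0.5.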
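
A small sketch of that post-processing step follows. The logits are made-up values, and the 0.5 cutoff is an assumption that you may want to tune per label.

```python
import torch

# Made-up logits, as a model trained with BCEWithLogitsLoss might return.
logits = torch.tensor([[2.0, -1.0, 0.5, -3.0]])

probs = torch.sigmoid(logits)  # per-label probabilities in (0, 1)
preds = (probs > 0.5).int()    # binary labels; 0.5 is an assumed cutoff

print(preds)  # tensor([[1, 0, 1, 0]], dtype=torch.int32)
```

Apply this only at inference time; during training, pass the raw logits to BCEWithLogitsLoss.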