Tags: python, pytorch, loss-function

Output activation function for tensors for multilabel classification


My expected label is a tensor like tensor([[0, 1, 0, 1], [1, 1, 1, 0]]).

The output from my model (note: I stopped training early just to obtain these values, so they do not accurately represent the activations of a trained network) is also a tensor, like tensor([[-10.6964, -13.8998, 0.8348, -45.7040], [-10.3260, -13.8385, -9.2342, -5.3424]]).

Am I wrong in using BCEWithLogitsLoss directly on the outputs and labels? Would I need to convert the output tensor to a binary one similar to the expected label prior to using BCEWithLogitsLoss? I understand that BCEWithLogitsLoss is just BCELoss + Sigmoid activation. How can I obtain values of the type of the expected label tensor and what loss should I use in that case?


Solution

  • For multi-label classification, BCELoss (or, when working with raw logits, BCEWithLogitsLoss) is a common choice. BCEWithLogitsLoss works directly on your outputs and labels: it expects the raw logits as one input and the 0/1 class labels as the other, and applies the sigmoid internally, so you do not need to convert the output tensor to a binary one beforehand. You can find more details regarding this loss and the expected label tensor here: BCEWithLogitsLoss
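
For reference, a minimal sketch (using the example tensors from the question) of feeding raw logits straight into BCEWithLogitsLoss, and of recovering hard 0/1 predictions afterwards by applying a sigmoid and thresholding at 0.5:

```python
import torch
import torch.nn as nn

# Raw logits straight from the model (no sigmoid applied) and the 0/1 targets.
# Targets must be floats for BCEWithLogitsLoss.
logits = torch.tensor([[-10.6964, -13.8998, 0.8348, -45.7040],
                       [-10.3260, -13.8385, -9.2342, -5.3424]])
targets = torch.tensor([[0., 1., 0., 1.],
                        [1., 1., 1., 0.]])

# BCEWithLogitsLoss applies the sigmoid internally, so the logits go in as-is.
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)

# For metrics or inspection (NOT for the loss), convert logits to 0/1
# predictions: sigmoid maps each logit to a probability, threshold at 0.5.
probs = torch.sigmoid(logits)
preds = (probs > 0.5).long()
```

The loss value itself will be large here simply because the model was barely trained; the important point is that no manual sigmoid or binarization happens before the loss call.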