I do not understand how PyTorch does log normalisation, and searching around I cannot find a good example or explanation. Can someone provide an explanation, please?
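import torch
import torch.distributions as dist
input_tensor = torch.tensor([0.6, 0.4])
m = dist.Categorical(logits=input_tensor)
# m.logits gives:
tensor([-0.5981, -0.7981])
# whereas the log of my input is:
tensor([-0.5108, -0.9163])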
My probabilities sum to one, so there is nothing to normalise, yet PyTorch is transforming my input.
The PyTorch documentation says:
The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.
The np.log function computes the natural logarithm ln(x) of each element. Hence, if the input value is [0.6, 0.4], the resulting output would be tensor([-0.5108, -0.9163]).
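As a quick check of those two numbers, here is a minimal sketch that uses only the standard library's math.log, which computes the same natural logarithm as np.log. The variable names are just for illustration.

```python
import math

# The probabilities from the question.
probs = [0.6, 0.4]

# Natural logarithm of each probability.
log_probs = [math.log(p) for p in probs]
print([round(v, 4) for v in log_probs])  # [-0.5108, -0.9163]

# Exponentiating the log probabilities recovers the original values.
recovered = [math.exp(v) for v in log_probs]
print(recovered)
```

These are the values you would get from m.logits if [0.6, 0.4] had been treated as probabilities.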
For Categorical, you can explore the PyTorch categorical.py source code here: https://github.com/pytorch/pytorch/blob/main/torch/distributions/categorical.py
There you will find this line of code:
self.logits = logits - logits.logsumexp(dim=-1, keepdim=True)
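Your input tensor [0.6, 0.4] is passed as logits, so Categorical treats it as unnormalized log probabilities rather than as probabilities. For this input, the value of logits.logsumexp(dim=-1, keepdim=True) is log(e^0.6 + e^0.4) ≈ 1.1981.
Subtracting it from each entry gives 0.6 - 1.1981 = -0.5981 and 0.4 - 1.1981 = -0.7981, which is why m.logits yields tensor([-0.5981, -0.7981]).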
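To see this end to end, here is a minimal sketch that repeats the subtraction by hand and compares it with what Categorical stores. It also builds a second distribution with the probs argument, on the assumption that you meant [0.6, 0.4] as probabilities. The names log_norm, manual_logits and p are only illustrative.

```python
import torch
import torch.distributions as dist

logits = torch.tensor([0.6, 0.4])

# Manual normalization: subtract log(e^0.6 + e^0.4) from every entry.
log_norm = logits.logsumexp(dim=-1, keepdim=True)
manual_logits = logits - log_norm

# What Categorical computes when the values are passed as logits.
m = dist.Categorical(logits=logits)

# The same numbers passed as probabilities instead of logits.
p = dist.Categorical(probs=torch.tensor([0.6, 0.4]))

print(log_norm)       # roughly 1.1981
print(manual_logits)  # matches m.logits
print(m.logits)       # tensor([-0.5981, -0.7981])
print(m.probs)        # softmax of the input, sums to 1
print(p.logits)       # tensor([-0.5108, -0.9163])
```

If [0.6, 0.4] are meant as probabilities, passing them via probs (or passing their logarithm as logits) gives the log probabilities you expected.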