Tags: python, pytorch, probability

PyTorch log normalisation calculation


I do not understand how PyTorch does log normalisation, and searching around I cannot find a good example or explanation. Can someone provide an explanation, please?

e.g.

import numpy as np
import torch
import torch.distributions as dist

input_tensor = torch.tensor([0.6, 0.4])
m = dist.Categorical(logits=input_tensor)
print(np.log(input_tensor))
print(m.logits)

gives:

tensor([-0.5108, -0.9163])
tensor([-0.5981, -0.7981])

My probabilities already sum to one, so there is nothing to normalise, yet PyTorch is transforming my input.

The PyTorch documentation says:

The logits argument will be interpreted as unnormalized log probabilities and can therefore be any real number. It will likewise be normalized so that the resulting probabilities sum to 1 along the last dimension. logits will return this normalized value.

https://pytorch.org/docs/stable/distributions.html
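
A minimal sketch illustrating the quoted behaviour: logits are only defined up to an additive constant, so shifting every logit by the same amount leaves the resulting probabilities unchanged.

import torch
import torch.distributions as dist

t = torch.tensor([0.6, 0.4])
m1 = dist.Categorical(logits=t)
m2 = dist.Categorical(logits=t + 100.0)  # same logits shifted by a constant

# softmax is shift-invariant, so both give the same probabilities
print(m1.probs)  # tensor([0.5498, 0.4502])
print(m2.probs)  # tensor([0.5498, 0.4502])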


Solution

The np.log function computes the element-wise natural logarithm, y = ln(x).

Hence, if the input value is [0.6, 0.4], the resulting output is tensor([-0.5108, -0.9163]), since ln(0.6) ≈ -0.5108 and ln(0.4) ≈ -0.9163.
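
Note that, since your input values already sum to one, they can be passed as probabilities instead; in that case the distribution's logits attribute is simply their element-wise log, matching np.log:

import torch
import torch.distributions as dist

input_tensor = torch.tensor([0.6, 0.4])
m = dist.Categorical(probs=input_tensor)  # pass as probabilities, not logits
print(m.logits)  # tensor([-0.5108, -0.9163]), i.e. the log of the probabilities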

When dealing with Categorical, you can explore the PyTorch source in categorical.py: https://github.com/pytorch/pytorch/blob/main/torch/distributions/categorical.py

There you will find this line:

    self.logits = logits - logits.logsumexp(dim=-1, keepdim=True)
    

Considering your input tensor [0.6, 0.4], the value of logits.logsumexp(dim=-1, keepdim=True) is

log(e^0.6 + e^0.4) = log(1.8221 + 1.4918) = log(3.3139) ≈ 1.1981

Subtracting this from each logit gives [0.6 − 1.1981, 0.4 − 1.1981] ≈ [-0.5981, -0.7981], which is why m.logits yields tensor([-0.5981, -0.7981]).
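
Putting it together, a short sketch verifying the normalisation by hand:

import torch
import torch.distributions as dist

input_tensor = torch.tensor([0.6, 0.4])
m = dist.Categorical(logits=input_tensor)

# reproduce the line from categorical.py by hand
lse = input_tensor.logsumexp(dim=-1, keepdim=True)
print(lse)                 # tensor([1.1981])
print(input_tensor - lse)  # tensor([-0.5981, -0.7981])
print(m.logits)            # tensor([-0.5981, -0.7981])

# the normalised logits exponentiate to probabilities that sum to 1
print(m.logits.exp().sum())  # tensor(1.)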