Tags: python, matplotlib, entropy

Matplotlib mlab entropy calculation incorrect?


I have been using the matplotlib.mlab.entropy function and have noticed a possible error in the mlab code.

The documentation states that the calculation is:

[Image: entropy formula (source: matplotlib.org)]

Note the log to base two. This matches the usual definition of Shannon entropy, measured in bits.

However, in the mlab.py source, the calculation is using the natural logarithm:

S = -1.0 * np.sum(p * np.log(p)) + np.log(delta)

Surely that should be np.log2()?

I have tried the calculation myself using a couple of other methods (this, for example). I also copied the mlab function and changed np.log to np.log2, which made its result consistent with the others.

So it looks to me that matplotlib.mlab.entropy is incorrect. Or am I missing something?
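For reference, the comparison described above can be sketched roughly as follows. This is only an illustration: `hist_entropy` is a hypothetical stand-in, not the mlab source. It is built around the single line quoted from mlab.py, and the histogram and normalisation steps around that line are assumptions. The `log` parameter is also an addition, so that both bases can be tried on the same data.

```python
import numpy as np


def hist_entropy(y, bins, log=np.log):
    """Histogram-based entropy estimate (hypothetical helper, not mlab itself).

    `log` selects the unit: np.log gives nats, np.log2 gives bits.
    """
    counts, edges = np.histogram(y, bins)
    counts = counts[counts > 0].astype(float)  # drop empty bins to avoid log(0)
    p = counts / len(y)                        # per-bin probabilities
    delta = edges[1] - edges[0]                # bin width (uniform bins assumed)
    # Same form as the line quoted from mlab.py, with the log swappable.
    return -1.0 * np.sum(p * log(p)) + log(delta)


rng = np.random.default_rng(0)
y = rng.normal(loc=0.0, scale=2.0, size=200_000)

s_nats = hist_entropy(y, 100, log=np.log)    # natural log, as in mlab.py
s_bits = hist_entropy(y, 100, log=np.log2)   # base two, as in the docs

# Converting the base-two result back with ln(2) recovers the natural-log one.
print(s_nats, s_bits, s_bits * np.log(2))
```

Under these assumptions the two results differ only by the constant factor ln 2, so the natural-log version reports the same quantity in nats rather than in bits.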


Solution

  • The documentation is incorrect, as confirmed by @user333700.

    Following advice on the matplotlib-users mailing list, I have submitted a pull request to fix the documentation.
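A rough way to see that the np.log line from the question is self-consistent in natural units: for Gaussian samples, a histogram estimate using the natural logarithm should land close to the analytic differential entropy 0.5 * (1 + ln(2 * pi * sigma**2)), which is expressed in nats. The sketch below assumes a hypothetical helper, `entropy_nats`, wrapped around that line. The surrounding histogram steps, the sample size, and the bin count are assumptions chosen for illustration, not the mlab source.

```python
import numpy as np


def entropy_nats(y, bins):
    """Histogram entropy estimate in nats (hypothetical helper, not mlab itself)."""
    counts, edges = np.histogram(y, bins)
    counts = counts[counts > 0].astype(float)  # keep only occupied bins
    p = counts / len(y)                        # per-bin probabilities
    delta = edges[1] - edges[0]                # bin width (uniform bins assumed)
    return -1.0 * np.sum(p * np.log(p)) + np.log(delta)


sigma = 2.0
rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=sigma, size=200_000)

s_est = entropy_nats(x, 100)
# Analytic differential entropy of a Gaussian, in nats.
s_analytic = 0.5 * (1.0 + np.log(2.0 * np.pi * sigma**2))

print(s_est, s_analytic)
```

If the two printed values agree closely, that is consistent with the conclusion above: the calculation itself works in nats, and only the base of the logarithm shown in the documentation needed correcting.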