Example:
If I have a variable X = [1 2 2 0], what's the correct way of calculating its entropy?
My attempt (using MATLAB):
p1 = 1/4; % probability of the value 1 occurring
p2 = 2/4; % probability of the value 2 occurring
p0 = 1/4; % probability of the value 0 occurring
H = -(1/4*log2(1/4) + 2/4*log2(2/4) + 1/4*log2(1/4))
% = 1.5
My problem and confusion: should I consider the zero values of X?
Using the entropy function of MATLAB I get the same value.
Thank you.
The answer to your question depends on what you are attempting to do.
If X represents the data associated with a greyscale image, then the entropy function is what you are looking for:
X = [1 2 2 0];
H = entropy(X); % 0.811278124459133
But neither your X variable nor your expected result (1.5) points to that solution. To me, it seems you are simply trying to calculate the Shannon entropy of a vector of values. Hence, you must use a different approach:
X = [1 2 2 0];
% Build the probabilities vector according to X...
X_uni = unique(X);
X_uni_size = numel(X_uni);
P = zeros(X_uni_size,1);
for i = 1:X_uni_size
P(i) = sum(X == X_uni(i));
end
P = P ./ numel(X);
% Compute the Shannon's Entropy
H = -sum(P .* log2(P)); % 1.5
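As an aside, the counting loop above can also be vectorized. A minimal sketch using accumarray (same result, just more idiomatic MATLAB):

```matlab
X = [1 2 2 0];
[~, ~, idx] = unique(X);                 % map each element to the index of its unique value
P = accumarray(idx(:), 1) ./ numel(X);   % relative frequencies: [0.25; 0.25; 0.5]
H = -sum(P .* log2(P));                  % 1.5, same as the loop version
```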
P must sum to 1, and probabilities (not values) equal to zero must be excluded from the computation, since log2(0) is -Inf. With the code above it's impossible to produce such probabilities, so there is no need to handle them.
Why are the results different? That's very simple to explain. In the first example (the one that uses the entropy function), MATLAB is forced to treat X as a greyscale image (a matrix whose values are either between 0 and 1, or ranging from 0 to 255). Since the underlying type of X is double, the variable is internally converted by the function im2uint8 so that all its values fall within the correct range of a greyscale image, thus obtaining:
X = [255 255 255 0];
This produces a different vector of probabilities:
P = [0.25 0.75];
which yields a Shannon entropy of 0.811278124459133.
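You can verify this conversion by hand. A quick sketch (im2uint8 and entropy both belong to the Image Processing Toolbox):

```matlab
X  = [1 2 2 0];
Xu = im2uint8(X);   % doubles are assumed to lie in [0,1], so 1 and 2 saturate to 255:
                    % Xu = [255 255 255 0]
P  = [sum(Xu == 0) sum(Xu == 255)] ./ numel(Xu);   % [0.25 0.75]
H  = -sum(P .* log2(P));                           % 0.8113, matching entropy(X)
```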