
Am I using histc wrong, or is this MATLAB's fault?


Ok, here's some code in MATLAB:

data = [1 1.5 2 3 4 4.5 5 6 7 7 7 0 0 0];

histc(data, [1:1:5])
histc(data, [1:1:5, inf])
histc(data, [-inf, 1:1:5])

which outputs the following:

ans = 2     1     1     2     1
ans = 2     1     1     2     5     0
ans = 3     2     1     1     2     1

My question is, why does MATLAB return a useless 0 when you use inf as the last bin edge (to mean >= 5 in this case)?

Won't it always be zero? The help says the output will always be the same length as the edges vector, but isn't that a bad spec in this case?


Solution

  • That's actually the correct behavior of HISTC. When you use the syntax:

    n = histc(x,edges);
    

    then, from the documentation:

    n(k) counts the value x(i) if edges(k) <= x(i) < edges(k+1). The last bin counts any values of x that match edges(end).

    Therefore, the last element of the output counts only the values that exactly equal the last edge. When inf is the last edge, that count is 0 because there are no infs in the data. When 5 is the last edge, it matches exactly one value in the data (the single 5).
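
As a rough check of the rule quoted above, the sketch below recomputes the counts by hand for the data from the question and compares them with what histc returns. The variable names `manual` and `counts` are made up for illustration. Dropping the last element, as in the final line, is just one possible way to discard the exact-match bin if you only want the interval counts.

```matlab
data  = [1 1.5 2 3 4 4.5 5 6 7 7 7 0 0 0];
edges = [1:5, inf];

n = histc(data, edges);      % 2 1 1 2 5 0, as in the question

% Recompute the counts by hand from the documented rule:
% n(k) counts x(i) when edges(k) <= x(i) < edges(k+1).
manual = zeros(size(edges));
for k = 1:numel(edges)-1
    manual(k) = sum(data >= edges(k) & data < edges(k+1));
end
% The last bin counts only exact matches of the last edge (here, inf).
manual(end) = sum(data == edges(end));

% Drop the trailing exact-match bin if only the interval counts are wanted.
counts = n(1:end-1);         % 2 1 1 2 5
```

With inf as the last edge, the half-open interval [5, inf) already captures every finite value of 5 or more, so the final element can only be nonzero if the data itself contains inf.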