I want to bin some data according to some 'steps', here 1:10. So bin{1} should contain values >=steps(1) & <steps(2), etc.
I'd appreciate some tips/feedback from the community. Put as a question: is there a common practice for binning data that I haven't found yet, and can the code be improved in terms of efficiency and readability?
data=abs(sin(0:.1:10)*10); %example data
steps=1:10; %user-defined bins
betw=@(x,mi,ma) x(x>=mi & x<ma); %function that returns values between minimum/maximum
bin={};
for ind=1:numel(steps)-1
bin{ind}=betw(data,steps(ind),steps(ind+1));
end
bin
bin =
1×9 cell array
Columns 1 through 7
{1×7 double} {1×7 double} {1×7 double} {1×8 double} {1×9 double} {1×7 double} {1×10 double}
Columns 8 through 9
{1×11 double} {1×27 double}
The histcounts function would be the "standard" way to do this:
data = abs(sin(0:.1:10)*10); %example data
steps = 1:10; %user-defined bins
hc = histcounts( data, steps );
>> hc =
[ 7 7 7 8 9 7 10 11 27 ]
Note that hc has one element fewer than steps because steps defines the bin edges. The total count sum(hc) is equal to the number of elements in data which fall between the lowest and highest bin edges - in this case fewer than numel(data) because some elements of data are lower-valued than the lowest edge in steps.
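As a quick sanity check of that count, a sketch along these lines should do (the names nInside, nBelow and nAbove are just my own, for illustration):

```matlab
data  = abs(sin(0:.1:10)*10);   % example data from the question
steps = 1:10;                   % bin edges

hc = histcounts(data, steps);

nInside = sum(hc);                  % elements that landed in some bin
nBelow  = nnz(data < steps(1));     % elements below the lowest edge
nAbove  = nnz(data > steps(end));   % elements above the highest edge

% Every element is either counted in a bin or lies outside the edges
assert(nInside + nBelow + nAbove == numel(data));
```

With the example data nAbove is zero, so the whole difference between sum(hc) and numel(data) comes from the values below steps(1).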
There are many options within histcounts to return the bin edges, specify the number of bins rather than edges, return the bin number for each element, etc.
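If you need the values in each bin rather than just the counts (as in the cell array from the question), the per-element bin number is the useful option. A minimal sketch, assuming the third output of histcounts is used for the bin index; the names idx and binned are my own:

```matlab
data  = abs(sin(0:.1:10)*10);   % example data from the question
steps = 1:10;                   % bin edges

% Third output: bin index of each element (0 means outside all bins)
[hc, edges, idx] = histcounts(data, steps);

% Collect the values belonging to each bin into a 1-by-(numel(steps)-1) cell
nBins  = numel(steps) - 1;
binned = arrayfun(@(k) data(idx == k), 1:nBins, 'UniformOutput', false);

% The number of values in each cell should match the counts
assert(isequal(cellfun(@numel, binned), hc));
```

One difference to be aware of: histcounts treats the last bin as closed on both sides, so a value exactly equal to steps(end) would land in the last bin, whereas the loop in the question excludes it. With the example data that case does not occur.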
If all you actually want is the bar plot (noted in your comment), you can use histogram, which calls histcounts under the hood for the computation, but outputs a figure too.
histogram( data, steps );