Here's my input file:
1.37987
1.21448
0.624999
1.28966
1.77084
1.088
1.41667
I would like to create bins of a size of my choice to get histogram-like output, e.g. something like this for bins of width 0.1, starting from 0:
0 0.1 0
...
0.5 0.6 0
0.6 0.7 1
...
1.0 1.1 1
1.1 1.2 0
1.2 1.3 2
1.3 1.4 1
...
My file is too big for R, so I'm looking for an awk solution (also open to anything else that I can understand, as I'm still a Linux beginner).
This was more or less answered in the post "awk histogram in buckets", but that solution is not working for me.
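This is also possible:
awk -v size=0.1 '
  { b = int($1/size); if (b*size > $1) b--   # floor, so negative values land in the right bucket
    a[b]++; if (b > bmax) bmax = b; if (b < bmin) bmin = b }
  END { for (i = bmin; i <= bmax; ++i) print i*size, (i+1)*size, a[i]+0 }
' <file>
It does essentially the same as EdMorton's solution, but starts printing buckets from the lowest bucket seen, which defaults to 0 when the input has no negative values. That way negative numbers are taken into account too. Because awk's int() truncates toward zero, the bucket index is decremented when truncation rounded a negative value up; otherwise everything between -size and 0 would be counted in the 0 bucket. The a[i]+0 makes empty buckets print 0 instead of an empty field.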
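To try this on the sample data from the question, here is a minimal sketch that wraps the same idea in a shell function. The name histogram and its single argument (the bin size) are my own choices for illustration, not part of any existing tool, and the function reads its numbers from standard input.

```shell
# histogram: hypothetical helper, not a standard command.
#   $1    = bin size
#   stdin = one number per line
histogram() {
  awk -v size="$1" '
    {
      b = int($1 / size)            # int() truncates toward zero
      if (b * size > $1) b--        # turn truncation into a floor for negative values
      count[b]++
      if (b > bmax) bmax = b        # bmax and bmin both start out as 0
      if (b < bmin) bmin = b
    }
    END {
      # count[i] + 0 prints 0 for buckets that received no values
      for (i = bmin; i <= bmax; i++)
        print i * size, (i + 1) * size, count[i] + 0
    }
  '
}

# The seven sample values from the question, 0.1-wide bins.
printf '%s\n' 1.37987 1.21448 0.624999 1.28966 1.77084 1.088 1.41667 | histogram 0.1
```

For a real file you would run it as histogram 0.1 < yourfile. With the sample values the output starts at "0 0.1 0" and includes the line "1.2 1.3 2". Note that awk prints whole numbers without a decimal point, so the bucket from 1.0 to 1.1 shows up as "1 1.1 1".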