I want to calculate histograms from a file "a_test.dat" like this:
# block_1
x y
1 1979.2
1 1598.6
1 1430.8
1 1544.7
1 1953.3
1 1288.8
1 1741.3
1 1791.1
# block_2
x y
2 2027
2 1591.4
2 1401.6
2 1923.7
2 1677
2 1311.1
2 1689.3
2 2015.8
So I define x_min and x_max, the number of bins (n_bins) and calculate the width of a bin:
x_min = 1250
x_max = 2250
n_bins = 20
bin_width = 1.*(x_max - x_min)/n_bins
For the histogram I take an array that I initialize with zeros:
array histo[n_bins]
do for [i=1:n_bins] {
histo[i] = 0
}
Values cannot belong to two bins. So I take the interval including the left boundary and not including the right boundary like [a,b[ :
leftInclude(x) = ( _i=int( floor( column(x)/bin_width ) ), histo[_i] = histo[_i] + 1 )
The following command works as long as "column(x)/bin_width" is defined (not NaN). How can I check for invalid results from "column(x)/bin_width"?
data_file = "a_test.dat"
data_blocks = "block_1 block_2"
set table $Temp
plot for [i=1:words(data_blocks)] data_file index word(data_blocks, i) using ( leftInclude("y") ) w table
unset table
This looks like a XY-problem. Actually, you want to create and plot a histogram, but you are doing it in an "unusual" way and hence running into problems.
Unless there is a reason that you have to use arrays, I would use the (what I would consider) the intended way in gnuplot using smooth frequency
(check help smooth
). There should be tons of examples out there.
If you nevertheless want to do it via arrays, I'm sure there is a way to do it using valid()
, check help valid
.
Script:
### create histograms (smooth frequency)
reset session
$Data <<EOD
# block_1
x y
1 1979.2
1 1598.6
1 1430.8
1 1544.7
1 1953.3
1 1288.8
1 1741.3
1 1791.1
# block_2
x y
2 2027
2 1591.4
2 1401.6
2 1923.7
2 1677
2 1311.1
2 1689.3
2 2015.8
EOD
x_min = 1250
x_max = 2250
n_bins = 20
bin_width = 1.*(x_max - x_min)/n_bins
bin(col) = floor(column(col)/bin_width)*bin_width
set boxwidth bin_width
set style fill transparent solid 0.3
set xtics out
set key top left
set xrange[x_min-bin_width:x_max+bin_width]
data_blocks = "block_1 block_2"
set multiplot layout 2,1
do for [i=1:words(data_blocks)] {
plot $Data u (bin(2)):(1) index i-1 smooth freq \
w boxes lc i ti word(data_blocks,i) noenhanced
}
unset multiplot
### end of script
Result: