In Gnuplot v5.4.2 I'd like to make boxplots for several data columns from different files, combined in one multi-boxplot with the same y-value range.
What I can do is combine the data into one file; however, the columns do not have the same number of entries. Can I just specify NaN (not a number) or some other value so that Gnuplot will ignore it? For example, when the combined file looks like:
"label 1" "label 2" "label 3"
12.3 44.2 13.3
12.4 12.5 14.4
11.6 13.7 NaN
NaN 15.7 NaN
I had a look at http://www.gnuplot.info/demo_5.2/boxplot.html but the labels for files like silver.dat are hard-coded in the gnuplot script, and all data columns have the same number of rows.
Thank you.
Edit:
Stitching together pieces from boxplot.html, I now have this:
$DATA <<EOF
"label 1" "label 2" "label 3"
11.3 14.2 11.3
12.3 44.2 16.3
1.4 12.5 14.4
16.4 17.5 17.4
17.4 12.5 14.4
12.4 12.5 14.4
11.6 13.7 NaN
NaN 15.7 NaN
EOF
set terminal png size 400,300;
set output "box.png";
set key autotitle columnhead
factors = "\"label 1\" \"label 2\" \"label 3\""
NF = words(factors)
# No legend
unset key
# Solid box-and-whiskers where the whiskers extend 0%...100%
set style data boxplot
set style fill solid 0.5 border -1
set style boxplot fraction 1
set boxwidth 0.6
set xtic ("" 0)
set for [i=1:NF] xtics add (word(factors,i) i)
plot $DATA using (1):1, '' using (2):2, '' using (3):3 ;
This produces a plot that is pretty close to what I am after. For multiple files, I would use something like:
...
set xtics add ("label 1" 1)
set xtics add ("label 2" 2)
set yrange[-0.4 : *]
set xrange[0.5 : 2.5]
plot "file1.data" using (1):(23+$5) , \
"file2.data" using (2):(23+$5);
Assuming I understood your question correctly, you want to plot boxplots from different files and different columns in one graph.
Somehow you have to specify the filenames and columns, e.g. in a string (list). So far, I haven't succeeded in using xticlabels together with the plotting style boxplot. So, each file is "plotted" a second time (actually, NaN is plotted, i.e. nothing) in order to get the corresponding column headers as x-tic labels.
For further reading, check help boxplot, help word, help words, help xticlabels, help columnhead.
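The two-pass idea described above can be sketched in a few lines with an inline datablock instead of files. This is only a minimal sketch: the datablock name $Data and the variable N are chosen here for illustration, and the values (including the NaN padding for the shorter columns) are copied from the question's own example, where the NaN entries did not prevent the boxplots from being drawn.

```gnuplot
### minimal sketch: boxplots with x-tic labels taken from the column headers
reset session

# illustrative inline data, values copied from the question's example
$Data <<EOD
"label 1" "label 2" "label 3"
11.3 14.2 11.3
12.3 44.2 16.3
1.4 12.5 14.4
16.4 17.5 17.4
17.4 12.5 14.4
12.4 12.5 14.4
11.6 13.7 NaN
NaN 15.7 NaN
EOD

N = 3    # number of columns to plot (assumed to be known here)

set style fill solid 0.3
set key noautotitle

# first pass: draw one boxplot per column at x = i
# second pass: plot NaN (i.e. nothing) only to pick up the column header as x-tic label
plot for [i=1:N] $Data u (i):i w boxplot, \
     for [i=1:N] $Data u (i):(NaN):xtic(columnhead(i)) w p
### end of sketch
```

The same pattern carries over to several files by replacing $Data with a per-index filename and the column number with a per-index column, as in the full script further down.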
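Edit: added mean values to the graph. The script uses stats to calculate the mean value of each selected column, stores the results in an array, and draws them as dashed horizontal lines with the vectors style. Check help stats, help arrays, help vectors.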
Data:
SO78599002_1.dat
# SO78599002_1
File1Col1 File1Col2 File1Col3 File1Col4
1 12 13 14
2 22 23 24
3 32 33 34
4 42 43 44
5 72 73 74
6 82 83 84
7 92 93 94
SO78599002_2.dat
# SO78599002_2
File2Col1 File2Col2 File2Col3
1 12 13
2 22 23
3 32 33
4 72 73
5 82 83
SO78599002_3.dat
# SO78599002_3
File3Col1 File3Col2
1 12
2 22
3 32
4 42
5 62
6 72
Script:
### boxplots from different files and selected columns plus mean values
reset session
FILES = "SO78599002_1.dat SO78599002_2.dat SO78599002_3.dat"
COLUMNS = "4 3 2"
N = words(FILES)
File(i) = word(FILES,i)
Col(i) = int(word(COLUMNS,i))
set style fill solid 0.3
set key noautotitle
array MEANS[N] # set array size
do for [i=1:N] {
    stats File(i) u Col(i) nooutput
    MEANS[i] = STATS_mean
}
plot for [i=1:N] File(i) u (i):Col(i) w boxplot, \
     for [i=1:N] File(i) u (i):(NaN):xtic(columnhead(Col(i))) w p, \
     MEANS u ($0+1-0.25):2:(0.5):(0) w vec dt 2 lc "black" nohead ti "mean value"
### end of script
Result: