How to build a 3D histogram in GNUPLOT


I have a data file (stat_data_raw.dat) and need to build a chart like this one from it:

[Image: sample chart made in Excel]

Is it possible to produce this in gnuplot? I have racked my brain trying to do it. The chart in the picture was built in Excel.

Data file: stat_data_raw.dat ( https://dropmefiles.com/oeO1L )

Build in Excel: pivot_and_chart.xlsx ( https://dropmefiles.com/xdoqy )


I've already tried a lot. I am using gnuplot v5.4 patchlevel 3. I have already visited a bunch of pages on the Internet, including the official website and Stack Overflow. But I could not find a suitable script for me to adapt for myself. Even though I know some algorithmic programming languages, the syntax of gnuplot seems strange and confusing to me. It is a pity that there is no visual editor in which one could build and set up a graph, and then export it to the gnuplot format.

I also visited the page http://gnuplot.sourceforge.net/demo_5.4/boxes3d.html many times, but the example given there is too simple. The data in the candlesticks.dat file is very primitive.

I tried to build a graph based on the stat_data_matrix.dat file (https://dropmefiles.com/omptl), where the data is in the form of a matrix. I can prepare the input data in any format (as a matrix or, as in the first case, as flat data). I don't know which is the better way to work with gnuplot.

The best I managed with the matrix data:

[Image: that's all I got]

set terminal qt size 1920, 1080
set encoding utf8
set datafile separator '\t'
set xyplane 0
set boxwidth .3
set boxdepth .3
set cbrange [0.5:15]
unset key; unset colorbox
set view 44, 200
splot for [col = 2:30] 'c:\LOAD\GNUPLOT\stat_data_matrix.dat' u ($0):(col):(column(col)):(col):xtic(1) with boxes lc pal title columnhead

Solution

  • The examples on the gnuplot homepage are a starting point, but sometimes it takes a lot more (commands, experience and understanding) to get the desired graph with all the "tricks" and "treats".

    I took your text data (not the matrix data). Your datafile separator is actually TAB, but it is easier to keep the default separator whitespace (and not switch to TAB only, i.e. set datafile separator tab). With whitespace as separator, the first column is the date, the second column is just the word "User", the 3rd column is the user number, and the 4th column is the z-value. I guess it is redundant to write "user" 32 times on the tic labels, so I put it once into the y-label.
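
    To check how the columns are split with whitespace separation, a minimal sketch along the following lines may help. The datablock names $Sample and $Parsed are made up for this illustration and are not part of the script further down; the two rows are copied from the data listed in this answer.

    ```gnuplot
    # $Sample and $Parsed are hypothetical names used only for this illustration
    $Sample <<EOD
    2022-07-29  User 1  53
    2022-07-30  User 6  1
    EOD

    set datafile separator whitespace   # the default; "User 1" is split into columns 2 and 3

    # write the parsed values to a datablock instead of drawing anything
    set table $Parsed
        plot $Sample u (timecolumn(1,"%Y-%m-%d")):3:4 w table
    unset table

    print $Parsed   # each row: time in seconds since 1970, user number, count
    ```

    With TAB as the only separator, "User 1" would stay one column and the count would be column 3 instead of column 4, so the using specification would have to change accordingly.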

    Some more comments:

    • time in gnuplot is handled as seconds from 01.01.1970 00:00:00, that's why the boxwidth is 24*3600 = 1 day, and tic spacing says set xtics 24*3600.
    • this type of graphical representation is not optimal since (depending on the viewing angle) you might hide some data, e.g. data from user 11 is partly hidden by data from user 12. So, you could also play with the viewing angle to improve or maybe sort the users to avoid this.
    • the 3D-bars are now centered on the grid lines. If you want to have the grid lines at the edges of the bars (like in your Excel image) you have to play some more "tricks".
    • Look at the example below which should be starting point for further tweaking.
    • Check help datafile separator, help timecolumn, help view; basically, you should find a help entry for every command.
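
    The time handling mentioned above can be tried out in isolation. This is only a sketch; DAY, t0 and t1 are illustrative variable names that do not appear in the script further down.

    ```gnuplot
    # DAY, t0 and t1 are illustrative variable names
    DAY = 24*3600                               # one day in seconds
    t0 = strptime("%Y-%m-%d", "2022-07-29")     # first date in the data
    t1 = strptime("%Y-%m-%d", "2022-07-30")     # the following day
    print t1 - t0 == DAY                        # 1 (true) if consecutive days are DAY seconds apart
    print strftime("%b %d", t0)                 # same format as used for the x tic labels

    set boxwidth DAY      # a bar spans exactly one day along the time axis
    set xtics DAY         # one major tic per day
    ```

    Because the x-axis is in seconds, any width or spacing along it has to be given in seconds as well, which is why the factor 24*3600 shows up in both settings.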

    Update:

    • Now with added labels of the z-value. Check plotting style with labels (check help labels).
    • grid lines are now at the border of the 3D-bars (earlier: centered), check help mxtics.
    • what I haven't found out yet is how to rotate the tic labels in a 3D plot. I asked a question about this and maybe there will be a better answer than mine which is placing and rotating the tic labels "manually".
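
    The idea behind moving the grid lines to the bar edges can be shown in a simplified 2D sketch (the full script uses splot; the datablock $Bars and its values are made up for this illustration). Major tics stay at the bar centers but get length 0, and the grid is drawn only at the minor tics halfway in between.

    ```gnuplot
    # $Bars is a made-up datablock: bar position, bar height
    $Bars <<EOD
    1 3
    2 5
    3 2
    EOD

    set boxwidth 1.0                 # bars touch each other, edges at 0.5, 1.5, 2.5, 3.5
    set style fill solid 0.5
    set xrange [0.5:3.5]
    set yrange [0:6]
    set xtics 1 scale 0,1            # major tics at the bar centers, tic mark length 0
    set mxtics 2                     # one minor tic halfway between two major tics = bar edge
    set grid mxtics ytics lt 1 lc "grey"   # x grid lines only at the minor tics
    plot $Bars u 1:2 w boxes notitle
    ```

    The tic labels still belong to the major tics, so they remain centered under the bars while the grid lines frame them.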

    As mentioned earlier, depending on the data, hiding of data by higher 3D-bars in front probably cannot be avoided completely. In order to minimize this, the users are sorted by highest average. For this, you need a few more lines:

    • you plot the data to tables (check help table) using the options smooth unique and smooth zsort (check help smooth).
    • then you are (mis-)using stats (check help stats) to put the order of users into the variable USERS.
    • during plotting you are using the ternary operator (check help ternary) to filter the data and plot it in the sequence given by USERS.
    • to get the ytic labels right, you have to place the labels via ytic(...), check help xticlabels.
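
    The sorting steps can be tried on a tiny example first. This is a sketch under assumptions: the datablock $Data (id, value) and the names $Avg, $Sorted and ORDER are made up for illustration, and smooth zsort requires a gnuplot version that offers it (check help smooth).

    ```gnuplot
    # $Data is made up: column 1 = id, column 2 = value
    $Data <<EOD
    1 10
    2 3
    1 20
    3 40
    2 5
    EOD

    set table $Avg
        plot $Data u 1:2 smooth unique          # average value per id
    set table $Sorted
        plot $Avg u 1:2:(-$2) smooth zsort      # sort by negated average, i.e. highest average first
    unset table

    ORDER = ''
    stats $Sorted u (ORDER = sprintf("%s %d", ORDER, $1)) nooutput   # collect ids in sorted order
    print ORDER

    # plot one id at a time in that order; rows of other ids become NaN and are skipped
    plot for [i=1:words(ORDER)] $Data u 0:($1==word(ORDER,i) ? $2 : NaN) w points pt 7 title word(ORDER,i)
    ```

    The stats command is used here only for its side effect: the expression in the using part is evaluated once per row, which appends each id to the string ORDER.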

    Edit: if you set the tics via yticlabels(), you won't see the mytics anymore and the minor y-grid lines will not show up. Hence, you have to add the ytic labels manually (pretty annoying)!

    As you can see, if you want to rearrange data a bit it can get pretty awkward in gnuplot, which is not easy to understand for a beginner. Well, gnuplot wants to be a plotting tool, not a data processing tool.

    Data: SO73521453.dat (TAB cannot be displayed here, but in the example below we are using whitespace anyway)

    Date    User    Count
    2022-07-29  User 1  53
    2022-07-29  User 2  3
    2022-07-29  User 3  1
    2022-07-29  User 4  2
    2022-07-29  User 5  1
    2022-07-29  User 6  5
    2022-07-29  User 7  1
    2022-07-29  User 8  1
    2022-07-30  User 1  2
    2022-07-30  User 2  2
    2022-07-30  User 6  1
    2022-07-30  User 9  1
    2022-07-31  User 1  1
    2022-07-31  User 10 1
    2022-08-01  User 1  37
    2022-08-01  User 2  1
    2022-08-01  User 11 1
    2022-08-01  User 3  2
    2022-08-01  User 4  4
    2022-08-01  User 5  1
    2022-08-01  User 6  7
    2022-08-01  User 9  1
    2022-08-01  User 8  1
    2022-08-01  User 12 12
    2022-08-02  User 1  40
    2022-08-02  User 3  2
    2022-08-02  User 4  13
    2022-08-02  User 5  1
    2022-08-02  User 6  6
    2022-08-02  User 10 1
    2022-08-02  User 12 11
    2022-08-03  User 1  25
    2022-08-03  User 2  5
    2022-08-03  User 13 4
    2022-08-03  User 3  4
    2022-08-03  User 14 2
    2022-08-03  User 4  10
    2022-08-03  User 5  1
    2022-08-03  User 6  5
    2022-08-03  User 15 1
    2022-08-03  User 12 2
    2022-08-04  User 1  81
    2022-08-04  User 3  1
    2022-08-04  User 14 1
    2022-08-04  User 4  2
    2022-08-04  User 5  1
    2022-08-04  User 6  1
    2022-08-04  User 16 2
    2022-08-04  User 17 2
    2022-08-04  User 10 1
    2022-08-04  User 18 1
    2022-08-04  User 12 6
    2022-08-05  User 1  40
    2022-08-05  User 14 2
    2022-08-05  User 4  3
    2022-08-05  User 6  3
    2022-08-05  User 9  3
    2022-08-05  User 10 1
    2022-08-05  User 19 1
    2022-08-05  User 15 1
    2022-08-05  User 18 4
    2022-08-05  User 12 17
    2022-08-06  User 1  1
    2022-08-07  User 1  1
    2022-08-07  User 12 2
    2022-08-08  User 1  30
    2022-08-08  User 13 8
    2022-08-08  User 3  3
    2022-08-08  User 4  12
    2022-08-08  User 5  3
    2022-08-08  User 6  3
    2022-08-08  User 20 2
    2022-08-08  User 12 19
    2022-08-08  User 21 1
    2022-08-09  User 1  51
    2022-08-09  User 11 2
    2022-08-09  User 13 6
    2022-08-09  User 4  4
    2022-08-09  User 6  5
    2022-08-09  User 22 1
    2022-08-09  User 12 12
    2022-08-09  User 21 1
    2022-08-09  User 23 1
    2022-08-10  User 1  61
    2022-08-10  User 2  2
    2022-08-10  User 13 2
    2022-08-10  User 4  2
    2022-08-10  User 6  1
    2022-08-10  User 24 1
    2022-08-10  User 25 1
    2022-08-10  User 15 1
    2022-08-10  User 12 10
    2022-08-10  User 21 1
    2022-08-11  User 1  27
    2022-08-11  User 2  4
    2022-08-11  User 13 2
    2022-08-11  User 14 2
    2022-08-11  User 4  2
    2022-08-11  User 5  1
    2022-08-11  User 6  7
    2022-08-11  User 26 1
    2022-08-11  User 12 16
    2022-08-12  User 1  23
    2022-08-12  User 11 1
    2022-08-12  User 13 7
    2022-08-12  User 3  1
    2022-08-12  User 4  1
    2022-08-12  User 6  11
    2022-08-12  User 20 1
    2022-08-12  User 10 1
    2022-08-12  User 12 4
    2022-08-13  User 11 2
    2022-08-14  User 1  2
    2022-08-15  User 1  59
    2022-08-15  User 2  3
    2022-08-15  User 13 5
    2022-08-15  User 3  2
    2022-08-15  User 14 1
    2022-08-15  User 4  3
    2022-08-15  User 5  1
    2022-08-15  User 6  9
    2022-08-15  User 24 1
    2022-08-15  User 26 1
    2022-08-15  User 27 1
    2022-08-15  User 28 1
    2022-08-15  User 12 6
    2022-08-15  User 23 2
    2022-08-16  User 1  53
    2022-08-16  User 11 1
    2022-08-16  User 13 2
    2022-08-16  User 3  1
    2022-08-16  User 6  2
    2022-08-16  User 24 1
    2022-08-16  User 9  1
    2022-08-16  User 12 7
    2022-08-17  User 1  58
    2022-08-17  User 11 2
    2022-08-17  User 13 2
    2022-08-17  User 3  2
    2022-08-17  User 4  3
    2022-08-17  User 5  1
    2022-08-17  User 6  9
    2022-08-17  User 10 1
    2022-08-17  User 29 1
    2022-08-17  User 12 23
    2022-08-17  User 21 1
    2022-08-18  User 1  54
    2022-08-18  User 2  3
    2022-08-18  User 11 1
    2022-08-18  User 13 5
    2022-08-18  User 3  1
    2022-08-18  User 5  1
    2022-08-18  User 6  2
    2022-08-18  User 28 1
    2022-08-18  User 8  1
    2022-08-18  User 12 17
    2022-08-19  User 1  64
    2022-08-19  User 2  1
    2022-08-19  User 13 2
    2022-08-19  User 3  1
    2022-08-19  User 6  5
    2022-08-19  User 24 2
    2022-08-19  User 9  1
    2022-08-19  User 8  1
    2022-08-19  User 12 2
    2022-08-20  User 1  1
    2022-08-20  User 11 2
    2022-08-20  User 6  2
    2022-08-21  User 1  2
    2022-08-21  User 6  3
    2022-08-22  User 1  60
    2022-08-22  User 2  2
    2022-08-22  User 11 1
    2022-08-22  User 13 7
    2022-08-22  User 3  1
    2022-08-22  User 5  1
    2022-08-22  User 6  10
    2022-08-22  User 28 1
    2022-08-22  User 8  1
    2022-08-22  User 12 16
    2022-08-22  User 23 1
    2022-08-23  User 1  31
    2022-08-23  User 2  1
    2022-08-23  User 13 4
    2022-08-23  User 14 1
    2022-08-23  User 6  3
    2022-08-23  User 16 1
    2022-08-23  User 18 1
    2022-08-23  User 12 15
    2022-08-24  User 1  50
    2022-08-24  User 13 2
    2022-08-24  User 3  3
    2022-08-24  User 14 1
    2022-08-24  User 5  2
    2022-08-24  User 6  5
    2022-08-24  User 9  1
    2022-08-24  User 28 1
    2022-08-24  User 12 3
    2022-08-25  User 1  32
    2022-08-25  User 11 1
    2022-08-25  User 13 4
    2022-08-25  User 30 1
    2022-08-25  User 5  2
    2022-08-25  User 6  4
    2022-08-25  User 16 1
    2022-08-25  User 9  1
    2022-08-25  User 15 1
    2022-08-25  User 12 24
    2022-08-26  User 1  11
    2022-08-26  User 2  1
    2022-08-26  User 13 8
    2022-08-26  User 5  1
    2022-08-26  User 6  7
    2022-08-26  User 31 14
    2022-08-26  User 28 1
    2022-08-26  User 12 2
    2022-08-27  User 2  2
    2022-08-27  User 5  1
    2022-08-27  User 31 2
    2022-08-27  User 9  2
    2022-08-27  User 28 1
    2022-08-28  User 28 1
    

    Script:

    ### 3D bars with labels and sorting
    reset session
    
    FILE = "SO73521453.dat"
    
    # sort users by highest average
    set table $Temp1
        plot FILE u 3:4 smooth unique        # get average for each user
    set table $Temp2
        plot $Temp1 u 1:2:(-$2) smooth zsort # sort by highest average
    unset table
    USERS = ''
    stats $Temp2 u (USERS=sprintf("%s %d",USERS,$1)) nooutput  # get user order in a string
    
    set datafile separator whitespace
    set format x "%b %d" timedate
    myBoxSize = 1.0
    set boxwidth 24*3600*myBoxSize
    set boxdepth myBoxSize
    set wall y0 fc "white"
    set wall x1 fc "white"
    set xyplane at 0
    set xtics 24*3600 out scale 0,1 font ",7" offset -1,0.2
    set mxtics 2
    set ytics 1 out scale 0,1 font ",7" offset 0,0.2
    set mytics 2
    set ylabel "user" rotate parallel font ",14" offset 0,1
    set grid mx,my,z lt 1 lc "grey"
    set key noautotitle
    set xrange [:] noextend reverse
    set yrange [0.5:words(USERS)+0.5] noextend
    set view 30,225
    set style textbox opaque fc rgb 0x77ffffff
    
    set format y ""
    set for [i=1:words(USERS)] label i word(USERS,i) at graph 0, first i, first 0 \
          offset 0,-0.5,0 rotate by 0 left font ",9"
    
    splot for [i=1:words(USERS)] FILE u (timecolumn(1,"%Y-%m-%d")):(i):\
            ($3==word(USERS,i)?$4:NaN):(i) w boxes lc var, \
          for [i=1:words(USERS)] '' u (timecolumn(1,"%Y-%m-%d")):(i):\
            ($3==word(USERS,i)?$4:NaN):4 w labels boxed font ",8"
    ### end of script
    

    Result: (certainly still to be optimized)

    [Image: resulting plot]