bash, csv, gnuplot

Bash script with gnuplot on a .csv file


I would like to write a small bash script that simply asks the user to enter a Country and an Item. The script should then open gnuplot, plot some data from a .csv file (the Order Date and Total Profit for that Country and Item) and save the plot as a .png.

These are the first few lines (100k in total):

Country,Item Type,Order Date,Total Profit
Afghanistan,Baby Food,1/10/2010,672649.62
Afghanistan,Baby Food,1/23/2010,833406.84
Afghanistan,Baby Food,10/14/2011,893894.50
Afghanistan,Baby Food,10/20/2013,38919.16
Afghanistan,Baby Food,11/10/2015,51189.24
Afghanistan,Baby Food,11/19/2014,471247.76

My input in gnuplot terminal:

set title 'sales'
set xlabel 'date'
set ylabel 'profit'
set datafile separator ','
set xdata time
set timefmt "%m/%d/%Y"
plot '100k.csv' u 3:4 w lines

If I enter these commands manually in the gnuplot terminal, it plots the wanted data. The problem is that it doesn't ignore the header line (Country,Item Type,Order Date,Total Profit).


Solution

  • It's not fully clear to me whether you are looking for a bash script or a gnuplot script, probably both. What I understand is that you want to make a call from the command line to generate some plots. So you need to filter data from a CSV file, then generate a plot and save it to disk.

    With the following suggestion you call gnuplot from the command line via:

    gnuplot -c "SO70774684.gp" "Afghanistan" "Baby Food"
    

    and gnuplot does everything for you, i.e. filtering the data and generating the PNG graph.

    I assume that the entries are sorted by date; otherwise a line plot doesn't make much sense, or the data needs to be sorted first. You don't give many details, so there is plenty of room for adjustments.
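If the file is not already in date order, it could be sorted before plotting. The following is only a minimal shell sketch. It assumes the date is the third comma-separated field in month/day/year form and that no field contains an embedded comma. The function name `sort_csv_by_date` is made up for illustration.

```shell
# Hypothetical helper: print a CSV with its data rows sorted by the
# month/day/year date in column 3, keeping the header line first.
# Assumes no quoted fields that contain commas.
sort_csv_by_date() {
    head -n 1 "$1"                      # header line, unchanged
    tail -n +2 "$1" |
        awk -F, '{
            split($3, d, "/")           # d[1]=month, d[2]=day, d[3]=year
            # prefix each row with a sortable YYYYMMDD key and a tab
            printf "%04d%02d%02d\t%s\n", d[3], d[1], d[2], $0
        }' |
        sort -k1,1n |                   # numeric sort on the key
        cut -f2-                        # drop the key again
}
```

It could be used as `sort_csv_by_date 100k.csv > sorted.csv`, with the sorted file then given to gnuplot.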

    Data: 'SO70774684.csv' with some more lines added.

    Country,     Item Type, Order Date, Total Profit
    Afghanistan, Baby Food,  1/10/2010, 672649.62
    Belgium,     Baby Food,  1/10/2010, 1111
    Afghanistan, Baby Food,  1/23/2010, 833406.84
    Cyprus,      Baby Food,  1/23/2010, 2222
    Afghanistan, Baby Food, 10/14/2011, 893894.50
    Denmark,     Baby Food, 10/14/2011, 3333
    Afghanistan, Baby Food, 10/20/2013, 38919.16
    Estonia,     Baby Food, 10/14/2011, 4444
    Afghanistan, Baby Food, 11/19/2014, 471247.76
    France,      Baby Food, 10/14/2011, 5555
    Afghanistan, Baby Food, 11/10/2015, 51189.24
    Germany,     Baby Food, 10/14/2011, 6666
    Belgium,     Baby Food,  1/10/2010, 1000
    Cyprus,      Baby Food,  1/23/2010, 2000
    Denmark,     Baby Food, 10/14/2011, 3000
    Estonia,     Baby Food, 10/14/2011, 4000
    France,      Baby Food, 10/14/2011, 5000
    Germany,     Baby Food, 10/14/2011, 6000
    

    Code: (gnuplot script file 'SO70774684.gp')

    ### create gnuplot graph from command line call
    reset session
    
    FILE    = 'SO70774684.csv'
    Country = ARG1
    Item    = ARG2
    
    set datafile separator comma
    set xdata time
    set timefmt "%m/%d/%Y"
    
    set term pngcairo size 640,384 font ",10"
    set output sprintf("%s,%s.png",Country,Item)
    
    set ylabel "Total profit" font ",11"
    set format x "%m\n%Y"
    set boxwidth 1.0 relative
    set style fill solid 0.3
    set key noautotitle
    set grid x,y
    
    set title sprintf("%s: %s", Country, Item)
    
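    # return the value of column colD only if columns colF1 and colF2 equal F1 and F2, otherwise NaN
    myFilter(colD,colF1,F1,colF2,F2) = strcol(colF1) eq F1 && strcol(colF2) eq F2 ? column(colD) : NaN
    # treat NaN as missing data so that the line connects the remaining points
    set datafile missing NaN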
    
    plot FILE u 3:(myFilter(4,1,Country,2,Item)) w lp pt 7 lc "red"
    
    set output
    ### end of code
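Because `myFilter` maps every non-matching row to NaN, a misspelled country or item leaves nothing to plot. A small shell pre-check could catch that before gnuplot is called. This is only a sketch: `count_matches` is a hypothetical helper, and it assumes exact, case-sensitive matches on columns 1 and 2, with optional spaces around the commas.

```shell
# Hypothetical helper: count the data rows whose first two fields
# equal the given country and item (header line is skipped).
# Usage: count_matches FILE COUNTRY ITEM
count_matches() {
    awk -F' *, *' -v c="$2" -v i="$3" '
        NR > 1 && $1 == c && $2 == i { n++ }
        END { print n + 0 }             # print 0 when nothing matched
    ' "$1"
}
```

A calling script might compare the result with 0 and print a hint instead of producing an empty graph.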
    

    Result: (File: 'Afghanistan,Baby Food.png')

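Since the question asks for a script that prompts for the country and the item, the gnuplot call from the command line could be wrapped in a small shell function. This is a minimal sketch, assuming the script file is named SO70774684.gp as in the answer and sits in the current directory. The function name `plot_sales` and the `GNUPLOT` override variable are made up for this example.

```shell
# Hypothetical wrapper: ask for a country and an item type, then let the
# gnuplot script do the filtering and plotting.
# GNUPLOT may be set to another command (an assumption, handy for dry runs).
plot_sales() {
    script="${1:-SO70774684.gp}"

    printf 'Country: ' >&2
    IFS= read -r country || return 1
    printf 'Item: ' >&2
    IFS= read -r item || return 1

    if [ -z "$country" ] || [ -z "$item" ]; then
        echo "Both a country and an item are required." >&2
        return 1
    fi

    # Quote the arguments so that values with spaces ("Baby Food")
    # reach gnuplot as a single ARG each.
    "${GNUPLOT:-gnuplot}" -c "$script" "$country" "$item"
}
```

Calling `plot_sales` at the end of the script file starts the prompt. With the gnuplot script from the answer, the graph is written to a file named after the two inputs, e.g. 'Afghanistan,Baby Food.png'.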