Tags: python, matplotlib, gnuplot, armadillo

Plotting large text file containing a matrix with gnuplot/matplotlib


For debugging purposes, my program writes its Armadillo-based matrices to text files in raw ASCII format, i.e. complex numbers are written as (1,1). The resulting files are larger than 3 GByte.

I would like to "plot" those matrices (representing fields) such that I can look at different points within the field for debugging. What would be the best way of doing that?

When directly plotting my file with gnuplot using

plot "matrix_file.txt" matrix with image

I get the response

warning: matrix contains missing or undefined values
Warning: empty cb range [0:0], adjusting to [-1:1]

I could also use Matplotlib, iterate over each row in the file and convert the values into appropriate Python values, but I assume that reading the full file this way would be rather time-consuming.

Thus, are there other reasonably fast options for plotting my matrix, or is there a way to tell gnuplot how to treat my complex numbers properly?

A part of the first line looks like

(0.0000000000000000e+00,0.0000000000000000e+00) (8.6305562282169946e-07,6.0526580514090297e-07) (1.2822974500623326e-05,1.1477679031930141e-05) (5.8656372718492336e-05,6.6626342814082442e-05) (1.6183121649896915e-04,2.3519364967920469e-04) (3.2919257507746272e-04,6.2745022681547850e-04) (5.3056616247733281e-04,1.3949688132772061e-03) (6.7714688179733437e-04,2.7240206117506108e-03) (6.0083005524875425e-04,4.8217990806492588e-03) (3.6759450038482363e-05,7.8957232784174231e-03) (-1.3887302495780910e-03,1.2126758313515496e-02) (-4.1629396217170980e-03,1.7638346107957101e-02) (-8.8831593853181175e-03,2.4463072133103888e-02) (-1.6244140097742808e-02,3.2509486873735290e-02) (-2.7017231109227786e-02,4.1531431496659221e-02) (-4.2022691198292300e-02,5.1101686500864850e-02) (-6.2097364532786636e-02,6.0590740956970250e-02) (-8.8060067117896060e-02,6.9150058884242055e-02) (-1.2067637255414780e-01,7.5697648270160053e-02) (-1.6062285417043359e-01,7.8902435158400494e-02) (-2.0844826713055306e-01,7.7163461035715558e-02) (-2.6452596415873003e-01,6.8580842184681204e-02) (-3.2898869195273894e-01,5.0918234150147214e-02) (-4.0163477687695504e-01,2.1561405580661022e-02) (-4.8179470918233597e-01,-2.2515842273449008e-02) (-5.6815035401912617e-01,-8.4759639628930100e-02) (-6.5850621484774385e-01,-1.6899215347429869e-01) (-7.4952345707877654e-01,-2.7928561041518252e-01) (-8.3644196044174313e-01,-4.1972419090890900e-01) (-9.1283160402230334e-01,-5.9403043419268908e-01) (-9.7042844114238713e-01,-8.0504703287094281e-01) (-9.9912107865273936e-01,-1.0540865412492695e+00) (-9.8715384989307420e-01,-1.3401890190155983e+00) (-9.2160320921981831e-01,-1.6593576679224276e+00) (-7.8916051033438095e-01,-2.0038702251062159e+00) (-5.7721850912406181e-01,-2.3617835609973805e+00) (-2.7521347260072193e-01,-2.7167550691449942e+00)

Ideally, I would like to be able to choose if I plot only the real part, the imaginary part or the abs()-value.


Solution

  • Here is a gnuplot-only version. I haven't yet seen a gnuplot example of how to plot complex numbers from a data file. The idea is to split the data into columns at the characters (, , and ) via:

    set datafile separator '(,)'
    

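    Each complex number then occupies three fields (real part, imaginary part, and the whitespace up to the next opening parenthesis), and every line starts with an empty field before the first (. You can therefore address the real and imaginary parts of the i-th number via column(3*i-1) and column(3*i), respectively.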
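
The column arithmetic can be checked outside gnuplot. The following is a minimal Python sketch, not part of the gnuplot solution: it only imitates the effect of splitting a line at (, , and ), and the helper names split_like_gnuplot, real_part and imag_part are made up for illustration.

```python
import re

def split_like_gnuplot(line):
    """Split a line at '(', ',' and ')' and return all fields, empty ones included."""
    return re.split(r"[(,)]", line)

line = "(0.1,0.2) (1.1,1.2) (2.1,2.2)"
fields = split_like_gnuplot(line)
# fields[0] is the empty field in front of the first '('

def real_part(i):
    # gnuplot columns are 1-based, Python lists are 0-based, hence the extra -1
    return float(fields[(3 * i - 1) - 1])

def imag_part(i):
    return float(fields[(3 * i) - 1])

for i in (1, 2, 3):
    print(i, real_part(i), imag_part(i))
```

Every complex number adds three fields, which is where the factor 3 in the column index comes from.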

    This creates a new dataset by plotting the data many times in a double loop, which is fine for small data. However, my guess is that this solution becomes pretty slow for large datasets, especially if you are plotting from a file. I assume it might be faster if you load your data once into a datablock (instead of reading from a file); see the question "gnuplot: load datafile 1:1 into datablock". In general, it may be more efficient to prepare the data with another tool, e.g. Python or awk.
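
As a rough idea of what such a preparation step could look like, here is a minimal Python sketch using only the standard library. It assumes that every token has exactly the form (re,im) and that tokens are separated by whitespace, as in the sample line of the question; the function names parse_row and convert are hypothetical. It streams the input line by line and writes the same "row col real imag abs" layout that the gnuplot code builds.

```python
import io

def parse_row(line):
    """Turn '(re,im) (re,im) ...' into a list of Python complex numbers."""
    values = []
    for token in line.split():
        re_text, im_text = token.strip("()").split(",")
        values.append(complex(float(re_text), float(im_text)))
    return values

def convert(src, dst):
    """Read matrix text from src and write 'row col real imag abs' lines to dst."""
    for row, line in enumerate(src):
        for col, z in enumerate(parse_row(line), start=1):
            dst.write(f"{row} {col} {z.real:.16e} {z.imag:.16e} {abs(z):.16e}\n")

# Small in-memory example; for a real file, pass open file handles instead.
src = io.StringIO("(0.1,0.1) (0.2,1.2)\n(3.0,-4.0) (1.2,1.2)\n")
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue())
```

Because the input is consumed one line at a time, only a single matrix row is held in memory. Note that the output in this layout is larger than the input, so for a multi-GByte file it may be preferable to write only the one component you want to plot.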

    Just a thought: if you have approx. 3e9 Bytes of data and (according to your example) approx. 48-50 Bytes per datapoint and if you want to plot it as a square graph, then the number of pixels on a side would be sqrt(3e9/50)=7746 pixels. I doubt that you have a display which can display this at once.
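
The estimate above, and one possible way of dealing with it, can be sketched in a few lines of Python. This is an illustration under assumptions rather than a recommendation: block_average is a hypothetical helper that shrinks a matrix (given as a list of lists of real values, e.g. the abs values) by averaging non-overlapping blocks. Averaging can hide single outliers, so if individual points matter for debugging, cropping a sub-window is probably the better choice.

```python
import math

def square_side(total_bytes, bytes_per_point):
    """Side length of a square matrix stored as total_bytes of text."""
    return math.sqrt(total_bytes / bytes_per_point)

def block_average(matrix, factor):
    """Average non-overlapping factor x factor blocks of a list-of-lists matrix.

    Rows and columns that do not fill a complete block are dropped.
    """
    n_rows = len(matrix) // factor
    n_cols = len(matrix[0]) // factor
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            block = [matrix[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

side = square_side(3e9, 50)
print(round(side))  # 7746, as estimated above

small = block_average([[1, 2, 3, 4],
                       [5, 6, 7, 8],
                       [9, 10, 11, 12],
                       [13, 14, 15, 16]], 2)
print(small)  # [[3.5, 5.5], [11.5, 13.5]]
```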

    Edit:

    The modified version below now uses set print to write to a datablock and is much faster than the original version (which used a double loop of plot ... every ...). I can already see the speed improvement with my small data example. Good luck with your huge dataset ;-). Just for reference and comparison, the old version is listed again here:

    # create a new datablock with row,col,Real,Imag,Abs
    # using plot ...with table     (pretty slow and inefficient)
    set table $Data2
        set datafile separator '(,)'          # now, split your data at these characters
        myReal(i) = column(3*i-1)
        myImag(i) = column(3*i)
        myAbs(i)  = sqrt(myReal(i)**2 + myImag(i)**2)
        plot for [row=0:rowMax-1] for [col=1:colMax] $Data u (row):(col):(myReal(col)):(myImag(col)):(myAbs(col)) every ::row::row w table
        set datafile separator whitespace     # set separator back to whitespace
    unset table
    

    Code: (modified using set print)

    ### plotting complex numbers
    reset session
    
    $Data <<EOD
    (0.1,0.1)   (0.2,1.2)   (0.3,2.3)   (0.4,3.4)   (0.5,4.5)
    (1.1,0.1)   (1.2,1.2)   (1.3,2.3)   (1.4,3.4)   (1.5,4.5)
    (2.1,0.1)   (2.2,1.2)   (2.3,2.3)   (2.4,3.4)   (2.5,4.5)
    (3.1,0.1)   (3.2,1.2)   (3.3,2.3)   (3.4,3.4)   (3.5,4.5)
    (4.1,0.1)   (4.2,1.2)   (4.3,2.3)   (4.4,3.4)   (4.5,4.5)
    (5.1,0.1)   (5.2,1.2)   (5.3,2.3)   (5.4,3.4)   (5.5,4.5)
    (6.1,0.1)   (6.2,1.2)   (6.3,2.3)   (6.4,3.4)   (6.5,4.5)
    (7.1,0.1)   (7.2,1.2)   (7.3,2.3)   (7.4,3.4)   (7.5,4.5)
    EOD
    
    stats $Data u 0 nooutput   # get number of columns and rows, separator is whitespace
    colMax = STATS_columns
    rowMax = STATS_records
    
    # create a new datablock with row,col,Real,Imag,Abs
    # using print to datablock
    set print $Data2
        myCmplx(row,col) = word($Data[row+1],col)
        myReal(row,col) = (s=myCmplx(row,col),s[2:strstrt(s,',')-1])
        myImag(row,col) = (s=myCmplx(row,col),s[strstrt(s,',')+1:strlen(s)-1])
        myAbs(row,col)  = sqrt(myReal(row,col)**2 + myImag(row,col)**2)
        do for [row=0:rowMax-1] {
            do for [col=1:colMax] {
                print sprintf("%d %d %s %s %g",row,col,myReal(row,col),myImag(row,col),myAbs(row,col))
            }
        }
    set print
    
    set key box opaque
    
    set multiplot layout 2,2
        plot $Data2 u 1:2:3 w image ti "Real part"
        plot $Data2 u 1:2:4 w image ti "Imaginary part"
        set origin 0.25,0
        plot $Data2 u 1:2:5 w image ti "Absolute value"
    unset multiplot
    ### end of code
    

    Result:

    [Image: multiplot with the three image plots "Real part", "Imaginary part" and "Absolute value"]