Plotting large arrays in pyqtgraph

For an electrophysiology data analysis set I need to plot a large 2D array (dim approx 20.000 x 120) of points. I used to embed a Matplotlib widget in my PyQt application, but went looking for other solutions because the plotting took quite long. Still, plotting the data with pyqtgraph also takes much longer then expected, probably because it redraws the widget everytime when using the plot() function.

What is the best practice to plot large arrays?

The pyqtgraph examples, although extensive, didn't help me much further...

import pyqtgraph as pg
view = pg.GraphicsLayoutWidget()
w1 = view.addPlot()

for n in data:
    w1.plot(n)

w1.plot(data)

The last rule generates ValueError: operands could not be broadcast together with shapes (10) (10,120)

Thanks in advance....

Solution

See this discussion: https://groups.google.com/forum/?fromgroups#!searchin/pyqtgraph/arraytoqpath/pyqtgraph/CBLmhlKWnfo/jinNoI07OqkJ

Pyqtgraph does not redraw after every call to plot(); it will wait until control returns to the Qt event loop before redrawing. However, it is possible that your code forces the event loop to be visited more frequently by calling QApplication.processEvents() (this can happen indirectly e.g. if you have a progress dialog).

Generally, the most important rule about improving performance is: profile your code. Do not make assumptions about what might be slowing you down if you can instead measure that directly.

Since I don't have access to your code, I can only guess how to improve it and show you how profiling can help. I'm going to start with the 'slow' example here and work through a few improvements.

1. The slow implementation

import pyqtgraph as pg
import numpy as np
app = pg.mkQApp()
data = np.random.normal(size=(120,20000), scale=0.2) + \
       np.arange(120)[:,np.newaxis]
view = pg.GraphicsLayoutWidget()
view.show()
w1 = view.addPlot()
now = pg.ptime.time()
for n in data:
    w1.plot(n)
print "Plot time: %0.2f sec" % (pg.ptime.time()-now)
app.exec_()

The output of this is:

Plot time: 6.10 sec

Now let's profile it:

$ python -m cProfile -s cumulative speed_test.py
. . .
     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
          1    0.001    0.001   11.705   11.705 speed_test.py:1(<module>)
        120    0.002    0.000    8.973    0.075 PlotItem.py:614(plot)
        120    0.011    0.000    8.521    0.071 PlotItem.py:500(addItem) 
    363/362    0.030    0.000    7.982    0.022 ViewBox.py:559(updateAutoRange)
. . .

Already we can see that ViewBox.updateAutoRange is taking a lot of time, so let's disable auto-ranging:

2. A bit faster

import pyqtgraph as pg
import numpy as np
app = pg.mkQApp()
data = np.random.normal(size=(120,20000), scale=0.2) + \
       np.arange(120)[:,np.newaxis]
view = pg.GraphicsLayoutWidget()
view.show()
w1 = view.addPlot()
w1.disableAutoRange()
now = pg.ptime.time()
for n in data:
    w1.plot(n)
w1.autoRange() # only after plots are added
print "Plot time: %0.2f sec" % (pg.ptime.time()-now)
app.exec_()

..and the output is:

Plot time: 0.68 sec

So that's a bit faster, but panning/scaling the plot is still quite slow. If I look at the profile after dragging the plot for a while, it looks like this:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.034    0.034   16.627   16.627 speed_test.py:1(<module>)
        1    1.575    1.575   11.627   11.627 {built-in method exec_}
       20    0.000    0.000    7.426    0.371 GraphicsView.py:147(paintEvent)
       20    0.124    0.006    7.425    0.371 {paintEvent}
     2145    0.076    0.000    6.996    0.003 PlotCurveItem.py:369(paint)

So we see a lot of calls to PlotCurveItem.paint(). What if we put all 120 plot lines into a single item to reduce the number of paint calls?

3. The fast implementation

After a couple rounds of profiling, I came up with this. It's based on using pg.arrayToQPath, as suggested in the thread above:

import pyqtgraph as pg
import numpy as np
app = pg.mkQApp()

y = np.random.normal(size=(120,20000), scale=0.2) + np.arange(120)[:,np.newaxis]
x = np.empty((120,20000))
x[:] = np.arange(20000)[np.newaxis,:]
view = pg.GraphicsLayoutWidget()
view.show()
w1 = view.addPlot()

class MultiLine(pg.QtGui.QGraphicsPathItem):
    def __init__(self, x, y):
        """x and y are 2D arrays of shape (Nplots, Nsamples)"""
        connect = np.ones(x.shape, dtype=bool)
        connect[:,-1] = 0 # don't draw the segment between each trace
        self.path = pg.arrayToQPath(x.flatten(), y.flatten(), connect.flatten())
        pg.QtGui.QGraphicsPathItem.__init__(self, self.path)
        self.setPen(pg.mkPen('w'))
    def shape(self): # override because QGraphicsPathItem.shape is too expensive.
        return pg.QtGui.QGraphicsItem.shape(self)
    def boundingRect(self):
        return self.path.boundingRect()

now = pg.ptime.time()
lines = MultiLine(x, y)
w1.addItem(lines)
print "Plot time: %0.2f sec" % (pg.ptime.time()-now)

app.exec_()

It starts quickly and panning/scaling is reasonably responsive. I'll stress, though, that whether this solution works for you will likely depend on the details of your program.