I was trying to plot some points (via geopandas) and discovered that adding them gets progressively slower, to the point where each one takes a second or more after only a few hundred points. This is hardly usable and definitely abnormal.
My guess is that matplotlib redraws everything every time new data is added to the figure, which is backed up by cProfile:
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
143915 479.433 0.003 479.602 0.003 collections.py:352(draw)
137549 0.238 0.000 478.297 0.003 collections.py:1014(draw)
...
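To illustrate why this would get progressively slower, here is a toy model. It assumes (as guessed above, not confirmed) that every plot call adds one collection to the axes and then triggers a redraw of every collection already there. The function names are hypothetical and nothing here touches matplotlib; it only counts.

```python
def draw_calls_with_redraw_per_plot(n_plots):
    """Count collection draws if every plot call redraws all collections so far."""
    collections_on_axes = 0
    draw_calls = 0
    for _ in range(n_plots):
        collections_on_axes += 1            # the new plot call adds one collection
        draw_calls += collections_on_axes   # assumed: everything is drawn again
    return draw_calls


def draw_calls_with_single_final_draw(n_plots):
    """Count collection draws if drawing is postponed until the very end."""
    return n_plots


print(draw_calls_with_redraw_per_plot(300))    # 45150
print(draw_calls_with_single_final_draw(300))  # 300
```

Under that assumption the work grows quadratically with the number of plot calls, while a single deferred draw grows linearly.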
Searching for a solution to this problem proved hard. I have no idea why the library would redraw on every change when nothing is being displayed. There are some recommendations to use a static backend and to switch to non-interactive mode. Using the corresponding configuration, however, does nothing for performance, as the library uses a static backend and non-interactive mode by default. Stepping through the source, I see no point at which interactive mode is even considered, or where a call to draw() could be conditionally intercepted.
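For reference, the commonly recommended settings would typically be applied roughly like this. It is only a sketch of that configuration; as described above, it made no difference to the timings in this case.

```python
import matplotlib

# Select the static, file-only Agg backend before any figure is created.
matplotlib.use("Agg")

from matplotlib import pyplot

# Explicitly switch interactive mode off.
pyplot.ioff()

# Inspect what is actually in effect.
print(matplotlib.get_backend(), matplotlib.is_interactive())
```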
Is there a way to make matplotlib postpone drawing until all data are added to the plot and are ready to be shown?
Here's the script to reproduce the issue. Let's see if it makes a difference.
from pandas import DataFrame
from geopandas import GeoDataFrame, points_from_xy
from matplotlib import pyplot


def plot_point(figure, geodata):
    geodata.plot(ax=figure.gca())


figure = pyplot.figure()
data = DataFrame({})
lat = 0
lon = 1
geodata = GeoDataFrame(data, geometry=points_from_xy([lon], [lat]))
for i in range(300):
    print(i)
    plot_point(figure, geodata)
The answer is "not out of the box". There are no settings to disable redraws in the library.
Some backends might batch redraws as a side effect. The base implementation, which Agg and most others use, redraws whenever it can. The comments in the source never claim to provide any protection from that, but leave batching as an option for inheriting classes.
So it is possible to implement a custom backend with a more conservative redraw strategy and use that. In my case the following was enough:
from matplotlib.backend_bases import _Backend
from matplotlib.backends.backend_agg import FigureCanvasAgg, _BackendAgg


class FigureCanvasAggLazy(FigureCanvasAgg):
    def draw_idle(self, *args, **kwargs):
        pass  # No intermediate draws needed if you are only saving to a file


@_Backend.export
class _BackendAggLazy(_BackendAgg):
    FigureCanvas = FigureCanvasAggLazy
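
A small way to check the effect of overriding draw_idle(), using only matplotlib and no geopandas. The classes CountingCanvas and LazyCountingCanvas and the function add_points are hypothetical helpers written for this sketch. It assumes each intermediate redraw arrives through draw_idle(), which is what pyplot.draw() requests.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


class CountingCanvas(FigureCanvasAgg):
    """Agg canvas that counts how often draw() is invoked (helper for this sketch)."""

    def __init__(self, figure=None):
        self.draw_count = 0
        super().__init__(figure)

    def draw(self, *args, **kwargs):
        self.draw_count += 1
        return super().draw(*args, **kwargs)


class LazyCountingCanvas(CountingCanvas):
    """Same canvas, but requests for an idle redraw are ignored."""

    def draw_idle(self, *args, **kwargs):
        pass


def add_points(canvas_cls, n_points=20):
    """Add points one by one, requesting a redraw after each, then save to PNG."""
    figure = Figure()
    canvas = canvas_cls(figure)
    axes = figure.add_subplot(111)
    for i in range(n_points):
        axes.scatter([i], [i])
        canvas.draw_idle()  # stands in for the redraw request made after each plot
    intermediate_draws = canvas.draw_count  # read before saving renders the figure
    buffer = io.BytesIO()
    figure.savefig(buffer, format="png")
    return intermediate_draws, buffer.getvalue()


eager_draws, _ = add_points(CountingCanvas)
lazy_draws, lazy_png = add_points(LazyCountingCanvas)
print(eager_draws, lazy_draws, len(lazy_png) > 0)
```

The lazy canvas skips every intermediate draw, yet saving still renders the figure. To use the backend from the answer with pyplot, the two classes would typically live in their own module (say, a hypothetical lazy_agg.py on the import path) and be selected with matplotlib.use("module://lazy_agg") before any figure is created.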