python matplotlib geopandas shapely

How to extract a geopandas plot as a numpy array that consists of numerical values of the pixels?


I have a GeoDataFrame and I want to get a numpy array that corresponds to the GeoDataFrame.plot().

At the moment, my code looks like this:

import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon
import matplotlib.pyplot as plt
from PIL import Image

# Create GeoDataFrame
poly_list = [Polygon([[0, 0], [1, 0], [1, 1], [0, 1]])]
polys_gdf = gpd.GeoDataFrame(geometry=poly_list)

# Save plot with matplotlib
plt.ioff()
polys_gdf.plot()
plt.savefig('plot.png')
plt.close()

# Open file and convert to array
img = Image.open('plot.png')
arr = np.array(img.getdata())

This is a minimal working example. My actual problem is that I have a list of thousands of GeoDataFrames, 'list_of_gdf'.

My first idea was to just run that in a loop:

arr_list = []
for element in list_of_gdf:
    plt.ioff() 
    element.plot()
    plt.savefig('plot.png')
    plt.close()

    img = Image.open('plot.png')
    arr_list.append(np.array(img.getdata()))

This seems like it could be done faster, for example without saving and reopening a .png file for every GeoDataFrame. Any ideas?


Solution

  • I found a solution that works for me. Instead of saving and reopening every picture as a .png, I use matplotlib's "backend agg to access the figure canvas as an RGB string and then convert it to an array" (https://matplotlib.org/3.1.0/gallery/misc/agg_buffer.html).

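    arr_list = []
    for element in list_of_gdf:
        plt.close('all')            # close figures left over from earlier iterations
        fig, ax = plt.subplots()
        ax.axis('off')              # hide axes, ticks, and frame
        element.plot(ax=ax)
        fig.canvas.draw()           # render the figure into the Agg buffer
        # RGBA pixel values, shape (height, width, 4), dtype uint8
        arr = np.array(fig.canvas.renderer.buffer_rgba())
        arr_list.append(arr)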
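
The loop above can also be wrapped in a small helper. The following is only a sketch under a few assumptions: the Agg backend is selected explicitly (so `fig.canvas.buffer_rgba()` is available and no window is opened), and `render_to_array`, `figsize`, and `dpi` are hypothetical names that do not appear in the original answer. The helper accepts any object with a `.plot(ax=...)` method, which includes a GeoDataFrame, and closes each figure as soon as its pixels have been copied.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering only, no interactive window
import matplotlib.pyplot as plt
import numpy as np


def render_to_array(plottable, figsize=(4, 4), dpi=100):
    """Draw plottable.plot(ax=ax) off-screen and return an (H, W, 4) uint8 RGBA array.

    `plottable` is any object with a .plot(ax=...) method, e.g. a GeoDataFrame.
    """
    fig, ax = plt.subplots(figsize=figsize, dpi=dpi)
    try:
        ax.axis("off")                 # hide axes, ticks, and frame
        plottable.plot(ax=ax)
        fig.canvas.draw()              # render into the Agg buffer
        # Copy, so the array stays valid after the figure is closed
        return np.asarray(fig.canvas.buffer_rgba()).copy()
    finally:
        plt.close(fig)                 # release the figure even if plotting fails
```

With this helper, the loop would reduce to `arr_list = [render_to_array(gdf) for gdf in list_of_gdf]`. Every array then has the same shape, `(figsize[1] * dpi, figsize[0] * dpi, 4)`, which makes it possible to stack them with `np.stack(arr_list)` if needed.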