
Getting the dimensions of a NumPy array right to plot a converted greyscale image


As part of Unity's ML Agents, images fed to a reinforcement learning agent can be converted to greyscale like so:

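import io

import numpy as np
from PIL import Image

def _process_pixels(image_bytes=None, bw=False):
    s = bytearray(image_bytes)
    image = Image.open(io.BytesIO(s))
    s = np.array(image) / 255.0  # scale pixel values to [0, 1]
    if bw:
        s = np.mean(s, axis=2)  # average the colour channels -> (height, width)
        s = np.reshape(s, [s.shape[0], s.shape[1], 1])  # add a channel axis -> (height, width, 1)
    return s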

As I'm not familiar enough with Python, and especially NumPy, how can I get the dimensions right for plotting the reshaped array? To my understanding, the shape is based on the image's height, width and number of channels, so after reshaping there is only one channel left, which holds the greyscale value. I just haven't found a way to plot it yet.

Here is a link to the mentioned code of the Unity ML Agents repository.

That's how I wanted to plot it:

plt.imshow(s)
plt.show()

Solution

  • Won't just doing this work?

    plt.imshow(s[..., 0])
    plt.show()
    

    Explanation

    plt.imshow accepts a 2-D array of shape (rows, columns), which it treats as scalar data and maps through a colormap, or a 3-D array of shape (rows, columns, 3) (treated as RGB) or (rows, columns, 4) (treated as RGBA). The array you had was (rows, columns, 1), which is none of these. NumPy indexing removes the last dimension: s[..., 0] says, "take all other dimensions as-is, but along the last dimension, get the slice at index 0". The result is 2-D; pass cmap='gray' to plt.imshow if you want it drawn in actual greyscale rather than with the default colormap.
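
To make the shapes concrete, here is a minimal sketch that follows the array through the bw branch of _process_pixels and then drops the trailing axis again. The random rgb array is only a hypothetical stand-in for a decoded frame, and show_grey is an illustrative helper that is not part of ML Agents or matplotlib; it assumes the values are already scaled to [0, 1].

```python
import numpy as np

# Hypothetical stand-in for a decoded RGB frame:
# 4 rows (height), 6 columns (width), 3 colour channels.
rgb = np.random.default_rng(0).random((4, 6, 3))

# Same two steps as the bw branch of _process_pixels.
grey = np.mean(rgb, axis=2)                               # shape (4, 6)
s = np.reshape(grey, [grey.shape[0], grey.shape[1], 1])   # shape (4, 6, 1)

# Three equivalent ways to drop the trailing channel axis.
a = s[..., 0]
b = s[:, :, 0]
c = np.squeeze(s, axis=2)

print(rgb.shape, grey.shape, s.shape, a.shape)
# (4, 6, 3) (4, 6) (4, 6, 1) (4, 6)


def show_grey(img2d):
    """Illustrative helper: plot a 2-D array with a grey colormap."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting

    # vmin/vmax pin the colour scale to the assumed [0, 1] value range.
    plt.imshow(img2d, cmap="gray", vmin=0.0, vmax=1.0)
    plt.show()
```

Calling show_grey(s[..., 0]) on the array returned by _process_pixels(..., bw=True) should then display the frame in greyscale. np.squeeze(s, axis=2) is a reasonable alternative when you want an error if the last axis unexpectedly has more than one entry.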