
ValueError: RGBA values should be within 0-1 range when plotting scatter plot


I am attempting to generate a scatter plot to show data before and after the PCA transform, similar to this tutorial.

To do this, I am running the following code:

fig, axes = plt.subplots(1,2)
axes[0].scatter(X.iloc[:,0], X.iloc[:,1], c=y)
axes[0].set_xlabel('x1')
axes[0].set_ylabel('x2')
axes[0].set_title('Before PCA')
axes[1].scatter(X_new[:,0], X_new[:,1], c=y)
axes[1].set_xlabel('PC1')
axes[1].set_ylabel('PC2')
axes[1].set_title('After PCA')
plt.show()

This raises the following error:

ValueError: RGBA values should be within 0-1 range

X is the preprocessed feature matrix, with 196 samples and 59 features. y is the dependent variable and contains two classes, [0, 1].

Here is the full error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-109-2c4f74ddce3f> in <module>
      1 fig, axes = plt.subplots(1,2)
----> 2 axes[0].scatter(X.iloc[:,0], X.iloc[:,1], c=y)
      3 axes[0].set_xlabel('x1')
      4 axes[0].set_ylabel('x2')
      5 axes[0].set_title('Before PCA')

~/anaconda3/lib/python3.7/site-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
   1597     def inner(ax, *args, data=None, **kwargs):
   1598         if data is None:
-> 1599             return func(ax, *map(sanitize_sequence, args), **kwargs)
   1600 
   1601         bound = new_sig.bind(ax, *args, **kwargs)

~/anaconda3/lib/python3.7/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, plotnonfinite, **kwargs)
   4495                 offsets=offsets,
   4496                 transOffset=kwargs.pop('transform', self.transData),
-> 4497                 alpha=alpha
   4498                 )
   4499         collection.set_transform(mtransforms.IdentityTransform())

~/anaconda3/lib/python3.7/site-packages/matplotlib/collections.py in __init__(self, paths, sizes, **kwargs)
    881         """
    882 
--> 883         Collection.__init__(self, **kwargs)
    884         self.set_paths(paths)
    885         self.set_sizes(sizes)

~/anaconda3/lib/python3.7/site-packages/matplotlib/collections.py in __init__(self, edgecolors, facecolors, linewidths, linestyles, capstyle, joinstyle, antialiaseds, offsets, transOffset, norm, cmap, pickradius, hatch, urls, offset_position, zorder, **kwargs)
    125 
    126         self._hatch_color = mcolors.to_rgba(mpl.rcParams['hatch.color'])
--> 127         self.set_facecolor(facecolors)
    128         self.set_edgecolor(edgecolors)
    129         self.set_linewidth(linewidths)

~/anaconda3/lib/python3.7/site-packages/matplotlib/collections.py in set_facecolor(self, c)
    676         """
    677         self._original_facecolor = c
--> 678         self._set_facecolor(c)
    679 
    680     def get_facecolor(self):

~/anaconda3/lib/python3.7/site-packages/matplotlib/collections.py in _set_facecolor(self, c)
    659         except AttributeError:
    660             pass
--> 661         self._facecolors = mcolors.to_rgba_array(c, self._alpha)
    662         self.stale = True
    663 

~/anaconda3/lib/python3.7/site-packages/matplotlib/colors.py in to_rgba_array(c, alpha)
    277             result[mask] = 0
    278         if np.any((result < 0) | (result > 1)):
--> 279             raise ValueError("RGBA values should be within 0-1 range")
    280         return result
    281     # Handle single values.

ValueError: RGBA values should be within 0-1 range

I am unsure what is causing this error and would appreciate help in figuring this out. Thanks!


Solution

  • The c= parameter of ax.scatter can be given in several ways:

    • A scalar or a sequence of n numbers to be mapped to colors using cmap and norm, that is, a single number or a list-like 1D sequence of numbers.
    • A 2D array in which the rows are RGB or RGBA. E.g. something like [[1,0,0], [0,0,1]]. All these values need to be between 0 and 1. Moreover, there should be either 3 (for RGB) or 4 (for RGBA) values per entry.
    • A sequence of colors of length n. E.g. ["red", "#B789C0", "turquoise"]
    • A single color format string. E.g. "cornflowerblue".
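
    As a rough illustration of the four forms listed above, here is a minimal sketch. The data (xs, ys), the four-panel layout, and the "viridis" colormap are made up for the example and are not from the question; the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, three points
xs = [1, 2, 3]
ys = [1, 4, 9]

fig, axes = plt.subplots(1, 4, figsize=(12, 3))

# 1) 1D sequence of numbers, mapped to colors through cmap and norm
axes[0].scatter(xs, ys, c=[0.1, 0.5, 0.9], cmap="viridis")

# 2) 2D array whose rows are RGB triples, every value between 0 and 1
axes[1].scatter(xs, ys, c=[[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# 3) sequence of n color specifications
axes[2].scatter(xs, ys, c=["red", "#B789C0", "turquoise"])

# 4) a single color format string
axes[3].scatter(xs, ys, c="cornflowerblue")
```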

    When an array of numbers is given, matplotlib distinguishes the first case from the second only by looking at the array's dimension. A 1D array is treated as the first case; a 2D array is treated as the second. Note that an Nx1 or 1xN array also counts as 2D. You can use np.squeeze() to "squeeze out" the length-1 dimension so that the values are mapped through the colormap.
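
    A minimal sketch of that fix, assuming the labels are stored as an Nx1 column. X_demo and y_demo are hypothetical stand-ins for the question's X and y (random data, not the asker's), and the Agg backend is chosen only so the snippet runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(196, 2))          # hypothetical stand-in for two feature columns
y_demo = rng.integers(0, 2, size=(196, 1))  # hypothetical 0/1 labels stored as an Nx1 column

# Drop the length-1 axis so the labels are a 1D sequence of numbers
c_values = np.squeeze(y_demo)               # shape (196, 1) becomes (196,)

fig, ax = plt.subplots()
points = ax.scatter(X_demo[:, 0], X_demo[:, 1], c=c_values, cmap="viridis")
ax.set_xlabel("x1")
ax.set_ylabel("x2")
```

    If y is a pandas DataFrame with a single column, converting it first with y.to_numpy().ravel() is one way to get the same 1D array.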