Hello everyone, I made a scatter plot based on two lists. Now I want to color the points based on their y-axis value. For example, if the y value is greater than 30000 I want to color the point red, and all other points blue. What's the best way to do this?
If you are using NumPy ndarrays, it's simpler:
import numpy as np
import matplotlib.pyplot as plt
# test data
y = np.random.randint(2800, 3100, size=(100,))
x = np.arange(0, 100)
# create a Boolean array (a mask), possibly negate it using the "~" unary operator
ygt3000 = y>3000
plt.scatter(x[~ygt3000], y[~ygt3000], color='blue')
plt.scatter(x[ygt3000], y[ygt3000], color='red')
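To make the masking step concrete, here is a minimal sketch of what the Boolean array selects. The four hand-written values and the name `above` are assumptions for illustration, standing in for the random test data and `ygt3000` above.

```python
import numpy as np

# Small hand-written sample instead of random data (illustrative values)
y = np.array([2900, 3050, 3000, 3100])
x = np.arange(len(y))

# Element-wise comparison gives a Boolean mask with the same shape as y
above = y > 3000
print(above)                 # [False  True False  True]

# Indexing with the mask keeps only the positions where it is True;
# "~" negates the mask and selects the remaining positions
print(x[above], y[above])    # [1 3] [3050 3100]
print(x[~above], y[~above])  # [0 2] [2900 3000]
```

Note that the comparison is strict, so a value exactly equal to the threshold ends up in the "blue" group.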
If you are using real lists, it's a bit more complicated, but it can be done using list comprehensions:
x = x.tolist()
y = y.tolist()
ygt3000 = [val>3000 for val in y]
plt.scatter([xv for xv, ygt in zip(x, ygt3000) if not ygt],
            [yv for yv, ygt in zip(y, ygt3000) if not ygt], color='blue')
plt.scatter([xv for xv, ygt in zip(x, ygt3000) if ygt],
            [yv for yv, ygt in zip(y, ygt3000) if ygt], color='red')
Here is the result of the code above when applied to two sequences of random numbers.
August 2021: because Trenton McKinney made a beautiful edit (thank you, Trenton), this post came to my attention again, and I saw the light:
plt.scatter(x, y, c=['r' if v>3000 else 'b' for v in y])
Just a day later, I realized that a similar trick can be done with NumPy (with y as an ndarray again, not a list), taking advantage of advanced indexing:
plt.scatter(x, y, c=np.array(('b','r'))[(y>3000).astype(int)])
But honestly, I prefer the two-pass approach I used previously, because it's more to the point and conveys much more meaning. In other words, the advanced-indexing version looks like obfuscated code...
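For readers puzzled by the advanced-indexing one-liner, this sketch unpacks it step by step. The sample values and the names `palette`, `idx`, `colors` and `colors_where` are assumptions for illustration; the `np.where` line is an alternative I am adding here, not something from the original answer.

```python
import numpy as np

# Small hand-written sample (illustrative values)
y = np.array([2900, 3050, 3000, 3100])

# Two-entry lookup table: index 0 -> blue, index 1 -> red
palette = np.array(['b', 'r'])

# Cast the Boolean mask to integers: False -> 0, True -> 1
idx = (y > 3000).astype(int)
print(idx)       # [0 1 0 1]

# Indexing with an integer array picks one palette entry per element of y
colors = palette[idx]
print(colors)    # ['b' 'r' 'b' 'r']

# Possibly more readable alternative with the same result
colors_where = np.where(y > 3000, 'r', 'b')
```

Either array can then be passed as the `c` argument, e.g. `plt.scatter(x, y, c=colors)`.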