I'm trying to plot a set of points where a couple of the points have extreme Y values. Is there a way to showcase those values on a graph without losing the detail of the other points? For example:
[(1, 10), (2, 33), (3, 100000), (4, 17), (5, 45), (6, 8), (7, 950000)]
I'm picturing a graph with Y ticks from 0 to 50 and then a similarly sized gap covering 50 to 100000 to showcase the extreme points.
Not sure whether matplotlib can do it automatically, but there are several things you can try:
1. Use a log-scale y-axis, i.e. ax.set_yscale('log'). This will to some extent decrease the difference between regular points and outlier points, but you will still see all the regular data lying at the bottom of the graph.
2. Manually change the y-values of the outliers to some not-that-extreme values. You can then show what those points are really worth by setting the displayed y-axis labels manually with plt.yticks(actual_data_array, what_to_display_array).
3. Generate a regular (tall enough) plot, then use image processing to cut out the middle part of the image. There might be a better way, but one way to do that would be plt.savefig to save the plot, matplotlib.image.imread to read it back, and finally plt.imshow(np.concatenate((image_data[:100], image_data[-100:]), 0)) to show only the top and bottom strips.
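For the first suggestion, a minimal sketch could look like the following. It uses the points from the question; the Agg backend and the output filename log_scale.png are my own assumptions so that it runs without a display, and you can swap them for plt.show() in an interactive session.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

# the points from the question
points = [(1, 10), (2, 33), (3, 100000), (4, 17), (5, 45), (6, 8), (7, 950000)]
xs, ys = zip(*points)

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o")
ax.set_yscale("log")         # compress the range between regular points and outliers
ax.grid(True, which="both")  # draw major and minor grid lines to make the log axis readable
fig.savefig("log_scale.png") # hypothetical output filename
```

All y-values here are positive, which a log axis requires; zero or negative values would need a different scale.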
Here is an example of the second method I was talking about:
import numpy as np
import matplotlib.pyplot as plt
a = np.array(((1, 10), (2, 33), (3, 100000), (4, 17), (5, 45), (6, 8), (7, 950000)), 'f')
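# compress the outliers (y >= 100000) into the range just above the regular data
outliers = a[:, 1] > 99999
a[outliers, 1] = a[outliers, 1] / 20000 + 55
plt.plot(*a.T)
# do the displaying trick: tick positions are in compressed units, labels show the real values
# (6 ticks for 0-50 so the integer labels 0, 10, ..., 50 sit exactly on their positions)
plt.yticks(np.r_[np.linspace(0, 50, 6),
                 np.linspace(100000/20000+55, 950000/20000+55, 5)],
           np.r_[np.linspace(0, 50, 6, dtype='i'),
                 np.linspace(100000, 950000, 5, dtype='i')])
plt.grid()
plt.show()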
It looks like this: