I have plotted a seaborn scatter plot. My data consists of 5000 data points. By looking into the plot, I definitely am not seeing 5000 points. So I'm pretty sure some kind of sampling is performed by seaborn scatterplot function. I want to know how many data points each point in the plot represent? If it depends on the code, the code is as following:
g = sns.scatterplot(x=data['x'], y=data['y'],hue=data['P'], s=40, edgecolor='k', alpha=0.8, legend="full")
Nothing would really suggest to me that seaborn is sampling your data. However, you can check the data in your axes g
to be sure. Query the children of the axes for a PathCollection (scatter plot) object:
g.get_children()
It's probably the first item in the list that is returned. From there you can use get_offsets
to retrieve the data and check its shape.
g.get_children()[0].get_offsets().shape