python matplotlib seaborn scatter-plot sampling

Check if seaborn scatterplot function is sampling data


I have plotted a seaborn scatter plot. My data consists of 5000 data points, but looking at the plot, I am definitely not seeing 5000 points. So I'm fairly sure the seaborn scatterplot function performs some kind of sampling. How many data points does each point in the plot represent? In case it depends on the code, here it is:

g = sns.scatterplot(x=data['x'], y=data['y'], hue=data['P'], s=40, edgecolor='k', alpha=0.8, legend="full")

[Image: the resulting scatter plot]


Solution

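  • Nothing suggests that seaborn is sampling your data. As far as I know, scatterplot does not subsample, and with 5000 markers of size 40 many of them most likely just overlap. However, you can check the data in your axes g to be sure. Query the children of the axes for a PathCollection (the scatter plot) object: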

    g.get_children()
    

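    It's probably the first item in the returned list. From there you can use get_offsets to retrieve the plotted coordinates and check the array's shape. The first dimension is the number of points drawn, so it should be 5000 if nothing was dropped.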

    g.get_children()[0].get_offsets().shape
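
The check above can be sketched end to end as follows. This is a minimal example, not the asker's actual setup: the DataFrame is randomly generated stand-in data with the same column names, and count_plotted_points is a hypothetical helper name. Rather than assuming the scatter is the first child, it looks at every PathCollection on the axes and sums their offsets.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.collections import PathCollection

# Stand-in data (an assumption): 5000 rows with the columns used in the question.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "x": rng.normal(size=5000),
    "y": rng.normal(size=5000),
    "P": rng.integers(0, 4, size=5000),
})

# Same call as in the question; scatterplot returns the matplotlib Axes.
g = sns.scatterplot(x=data["x"], y=data["y"], hue=data["P"],
                    s=40, edgecolor="k", alpha=0.8, legend="full")


def count_plotted_points(ax):
    """Hypothetical helper: total number of offsets over all PathCollections on ax."""
    return sum(
        len(child.get_offsets())
        for child in ax.get_children()
        if isinstance(child, PathCollection)
    )


n_plotted = count_plotted_points(g)
print(n_plotted, len(data))
```

If the two printed numbers match, every row was drawn and the missing markers are hidden behind others rather than sampled away. Note that rows containing missing values may not be plotted, so a smaller count on real data can also point to NaNs in x or y.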