python data-visualization seaborn data-science scatter-plot

Python/Seaborn: What does the horizontal distribution of the data points inside a strip plot mean, or is it random?


It seems that the horizontal distribution of the data points inside each strip is almost random every time you plot (using Seaborn). Is it for ease of readability, or does it serve some other meaningful purpose?

I am using Python 3.0 and the 'tips' dataset provided by Seaborn for this question:

    import seaborn as sns
    tips = sns.load_dataset("tips")

After running the same code twice, I see the points arranged differently inside each strip. Here is the code, which you can run a couple of times:

    ax = sns.stripplot(x="day", y="total_bill", data=tips, alpha=.55,
                       palette='Set1', jitter=True, linewidth=1)

Now, if you look at the plots (after running it twice, for example), you will notice that the points are not distributed the same way in the two plots:

[Images: plot from run 1, plot from run 2]

Please explain why the points are not distributed identically across two separate runs. Also, looking at the points on the horizontal scale: is there a reason why (for example) one red point sits further left than another red point, or is it simply for readability?

Thank you in advance!


Solution

  • After a bit more research, I believe that the horizontal placement of the data points is random, drawn from a uniform distribution (thank you @ImportanceOfBeingErnest for pointing to the code). So, to answer my own questions: the horizontal position carries no hidden meaning. The points are spread horizontally only so that overlapping points stay visible, and the layout changes from run to run unless a random seed is set, in which case it stays the same.
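
To illustrate the idea of uniform random jitter and the effect of a seed, here is a minimal sketch that uses only the Python standard library. It is not Seaborn's actual implementation: the helper `jitter_positions` and its parameters `half_width` and `seed` are hypothetical names made up for this example.

```python
import random


def jitter_positions(n_points, center=0.0, half_width=0.1, seed=None):
    """Return n_points x-positions spread uniformly around `center`.

    Hypothetical helper for illustration only; not Seaborn's code.
    """
    # With seed=None the generator is seeded from system entropy,
    # so each call produces a different set of offsets.
    rng = random.Random(seed)
    return [center + rng.uniform(-half_width, half_width)
            for _ in range(n_points)]


# Same seed: the horizontal offsets are identical on every run.
run_a = jitter_positions(5, seed=42)
run_b = jitter_positions(5, seed=42)
print("seeded runs match:", run_a == run_b)

# No seed: offsets vary between runs but stay inside the same band.
run_c = jitter_positions(5)
run_d = jitter_positions(5)
print("unseeded run 1:", run_c)
print("unseeded run 2:", run_d)
```

If Seaborn draws its jitter from NumPy's global random state (an assumption worth checking against the version you have installed), then calling `np.random.seed(...)` with a fixed value right before `sns.stripplot(...)` should reproduce the same point layout on each run.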