python pandas seaborn p-value pearson-correlation

Pearsonr and p-value


I am analyzing some data in pandas and plotting the correlation between two variables using the sns.jointplot() function. The plot is annotated with the correlation statistics for these two variables.

The value for pearsonr is 0.41 and p is 5e-18. What can I infer from these two values? Is there a good relationship between these two variables or not?

Also, if I want to display just pearsonr on the plot, how should I change my code? Below is the code that I am currently using.

ax = sns.jointplot(x='Comfort', y='Assurance', data=df, kind="kde", color='r')

Solution

  • The value for pearsonr is 0.41 and p is 5e-18. What can I infer from these two values? Is there a good relationship between these two variables or not?

    Roughly speaking:

    • The size of the correlation coefficient (0.41) suggests a weak-to-moderate positive correlation.
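    • The p-value (5e-18) suggests that the correlation coefficient is statistically significant, being much less than the common threshold of 0.01. The p-value is the probability of observing a correlation at least this strong if the variables were actually uncorrelated; choosing a threshold of 0.01 means accepting a 1% risk of concluding that a correlation exists when none does. A tiny p-value says the correlation is unlikely to be zero, not that it is strong.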
    • Remember that the Pearson correlation coefficient only measures linear relationships. You can get a Pearson correlation coefficient of 0 for variables (datasets) with a strong nonlinear relationship. Moreover, the p-value calculation assumes that your variables (datasets) are normally distributed.
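
To illustrate the point about nonlinear relationships, here is a minimal sketch using made-up data (not the data from the question). The variable y is completely determined by x through y = x**2, yet the Pearson coefficient comes out at essentially zero because the relationship is not linear.

```python
import numpy as np
from scipy.stats import pearsonr

# Made-up data for illustration only: x is symmetric around 0,
# and y depends on x perfectly, but not linearly.
x = np.linspace(-1, 1, 201)
y = x ** 2

# pearsonr returns the correlation coefficient and its p-value.
r, p = pearsonr(x, y)

print(f"pearsonr = {r:.3f}, p = {p:.3f}")
```

A scatter plot of the data remains the simplest way to spot this kind of pattern, since the coefficient alone cannot reveal it.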

    Also, if I want to display just pearsonr on the plot, how should I change my code?

    seaborn 0.9.0 does not display that information. To add it, compute the value using scipy.stats.pearsonr, then show it as part of the title of your figure.