pandas, scipy, t-test

What is the significance of the t-statistic when applying ttest_ind to two pandas Series?


What conclusion can be drawn from the resulting t-statistic when ttest_ind is applied to two independent Series?


Solution

  • As the SciPy documentation describes, scipy.stats.ttest_ind returns two values:

    • The calculated t-statistic.
    • The two-tailed p-value.

    Intuitively, you can read the t-statistic as the difference between the two sample means, normalized by the samples' variances and sizes:

    • The larger the samples, the more significant a given difference of means is, because there is more evidence for it.
    • The larger the variances, the less significant a given difference of means is, because the difference could more easily arise by chance alone.

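    The larger the absolute value of the t-statistic, the stronger the evidence that the means differ. Its sign only shows which sample has the larger mean (positive when the first sample's mean is larger).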
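
    To make the intuition concrete, here is a minimal sketch that computes the pooled-variance (Student's) t-statistic by hand, using only the standard library. It assumes equal variances in both groups, which is also the default assumption of ttest_ind (equal_var=True). The helper names pooled_t_statistic and spread, and all the data, are made up for illustration.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Student's t-statistic for two independent samples (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err


def spread(xs, factor):
    """Hypothetical helper: scale the deviations from the mean, keeping the mean fixed."""
    m = statistics.fmean(xs)
    return [m + factor * (x - m) for x in xs]


# Illustrative data: the two sample means differ by 0.5.
small_a = [5.1, 4.9, 5.6, 5.8, 6.0]
small_b = [4.6, 4.4, 5.1, 5.3, 5.5]

t_small = pooled_t_statistic(small_a, small_b)

# Same means, four times as many observations: |t| grows.
t_large = pooled_t_statistic(small_a * 4, small_b * 4)

# Same means and sizes, three times the standard deviation: |t| shrinks.
t_wide = pooled_t_statistic(spread(small_a, 3), spread(small_b, 3))

# Swapping the samples only flips the sign.
t_swapped = pooled_t_statistic(small_b, small_a)

print(t_small, t_large, t_wide, t_swapped)
```

    In all three comparisons the difference of means is the same 0.5; only the sample size or the spread changes, and the t-statistic changes with it.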

    The p-value makes this intuition explicit: it is the probability of observing a t-statistic at least as extreme as the computed one if the true difference of means were zero. If the p-value is below a chosen threshold, e.g. 0.05, we reject the hypothesis that the difference is zero.
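
    As a rough illustration of the whole workflow, the sketch below applies ttest_ind to two pandas Series and compares the p-value with a threshold. The data are synthetic (drawn with a fixed seed), and the names group_a, group_b and alpha are assumptions made for this example only.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data: two independent groups whose true means differ by 2.
rng = np.random.default_rng(0)
group_a = pd.Series(rng.normal(loc=10.0, scale=2.0, size=200))
group_b = pd.Series(rng.normal(loc=12.0, scale=2.0, size=200))

# ttest_ind returns the t-statistic and the two-sided p-value.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

alpha = 0.05  # significance threshold chosen before looking at the data
reject_null = p_value < alpha

print(f"t = {t_stat:.3f}, p = {p_value:.3g}")
if reject_null:
    print("Reject the hypothesis that the two means are equal.")
else:
    print("Not enough evidence to say the means differ.")
```

    Here the t-statistic comes out negative because group_a has the smaller mean. If the two groups cannot be assumed to have equal variances, passing equal_var=False to ttest_ind runs Welch's t-test instead.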