I am using Python 3.6 to run some statistical tests on a data set. What I am trying to accomplish is to run a t-test between the data set and its trend line to determine whether the trend is statistically significant. I am using scipy to do this, but I am not sure which variables I should pass to the test to get the outcome I need.
Here is my code so far:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

p = np.load('data.npy')

# Index 0 corresponds to the year 1901
start = 0
end = 100
year1 = 1901

# Mean of each year's data, ignoring NaNs
annualmean = []
for n in range(start, end):
    annualmean.append(np.nanmean(p[n]))

# Use an array rather than range() so that slope*a works elementwise
a = np.arange(start, end)

# Trendline plots
plt.figure()
plt.plot(a, annualmean, '-')
slope, intercept, r_value, p_value, std_err = stats.linregress(a, annualmean)
plt.plot(a, intercept + slope*a, 'r')

print(stats.ttest_ind(annualmean, a))
Right now the code runs without error messages, but I am getting an incredibly small p-value that I don't think is correct. If anyone knows which variables I should pass to the t-test, that would be very helpful. Thanks!
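It turns out I was confused about how to test for statistical significance. stats.ttest_ind compares the means of two independent samples, so my call was comparing the mean of the annual values with the mean of the year indices, which says nothing about the trend. The regression had already given me the p-value I wanted, in the line:
slope, intercept, r_value, p_value, std_err = stats.linregress(a,annualmean)
That p_value is the two-sided p-value for the null hypothesis that the slope is zero, so all I needed to do was: print(p_value)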
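For anyone who wants to see where that number comes from, here is a minimal sketch that reproduces the linregress p-value by hand as a t-test on the slope. The data are synthetic (an assumption, since data.npy is not available here), and the names rng, t_stat, dof and p_manual are mine, not from the original code.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for annualmean (assumption: NOT the real data.npy values)
rng = np.random.RandomState(0)
a = np.arange(100)
annualmean = 0.02 * a + rng.normal(0.0, 1.0, size=a.size)

# Regression as in the question; p_value tests the null hypothesis slope == 0
slope, intercept, r_value, p_value, std_err = stats.linregress(a, annualmean)

# The same test by hand: t statistic of the slope with n - 2 degrees of freedom
t_stat = slope / std_err
dof = a.size - 2
p_manual = 2 * stats.t.sf(abs(t_stat), dof)  # two-sided p-value

print(p_value, p_manual)
```

The two printed values should agree up to floating-point rounding, which suggests that the t-test I was looking for is already built into linregress.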