For linear regression with one variable and an intercept, I can compute the R-squared as:
R^2 = (np.sum(((x - np.mean(x)) / np.std(x, ddof=1)) * ((y - np.mean(y)) / np.std(y, ddof=1))) / (len(x) - 1)) ** 2
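This is just the squared Pearson correlation, so an equivalent and more compact form (assuming x and y are 1-D numpy arrays) is:
np.corrcoef(x, y)[0, 1] ** 2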
How do I compute R-squared for linear regression with one variable and no intercept, without resorting to statsmodels.api, OLS, linregress, or any other third-party package? Is my understanding correct that np.mean(y) = 0 is assumed for linear regression without an intercept? What is the fastest way in numpy to get the R-squared for linear regression with one variable and no intercept?
In the case of one variable with no intercept, you can simply do:
np.sum(x*y)**2 / (np.sum(x*x) * np.sum(y*y))
In matrix notation this can be written as
(y @ x)**2 / (x @ x * y @ y)
This follows because the no-intercept least-squares slope is b = np.sum(x*y)/np.sum(x*x), and without an intercept R-squared is conventionally defined as the uncentered ratio np.sum(y_hat**2)/np.sum(y**2); substituting y_hat = b*x reduces it to the expression above. So np.mean(y) is not assumed to be zero: the uncentered definition simply uses raw sums of squares in place of sums of squared deviations from the means.
For example:
import statsmodels.api as sm
x, y = sm.datasets.get_rdataset('iris').data.iloc[:,:2].values.T
print(sm.OLS(y,x).fit().rsquared)
0.9565098243072627
print((y @ x)**2/(x @ x * y @ y))
0.9565098243072627
Note that the two are equivalent.
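If you want a check that avoids statsmodels entirely, here is a minimal pure-numpy sketch of the derivation above (the data are arbitrary, generated only for illustration):
import numpy as np
rng = np.random.default_rng(0)               # arbitrary synthetic data
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)
b = np.sum(x * y) / np.sum(x * x)            # no-intercept least-squares slope
y_hat = b * x
r2_fitted = np.sum(y_hat**2) / np.sum(y**2)  # uncentered R^2 from fitted values
r2_closed = np.sum(x * y)**2 / (np.sum(x * x) * np.sum(y * y))
print(np.isclose(r2_fitted, r2_closed))
True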
You could extend the above to include multiple variables:
import statsmodels.api as sm, numpy as np
dat = sm.datasets.get_rdataset('iris').data
x = dat.iloc[:,1:4].values
y = dat.iloc[:,0].values
print(sm.OLS(y, x).fit().rsquared)
0.9961972754365206
print(y @ x @ np.linalg.solve(x.T @ x, x.T @ y) / (y @ y))
0.9961972754365208
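Since y @ x @ np.linalg.solve(x.T @ x, x.T @ y) equals np.sum(y_hat**2) for the fitted values y_hat = x @ beta, one more way to double-check, continuing the example above (np.linalg.lstsq fits the same no-intercept model):
beta = np.linalg.lstsq(x, y, rcond=None)[0]  # least-squares coefficients, no intercept
y_hat = x @ beta
print(np.sum(y_hat**2) / np.sum(y**2))       # agrees with the values above up to floating point
A side note on the one-liner: np.linalg.solve(x.T @ x, x.T @ y) solves the normal equations directly, which is faster and numerically safer than forming np.linalg.inv(x.T @ x) explicitly.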