
Ordinary Least Squares linear regression predictions from X & Beta without Sklearn


How can I obtain the predictions, given the Beta and X values, from the following?

Data

import numpy as np
x = np.array([20, 25, 28, 29, 30])
y = np.array([20000, 25000, 26000, 30000, 32000])

Transform

# Reshape X    
X = x.reshape((-1, 1))

# Append ones to X:
X = np.append(np.ones((len(X), 1)), X, axis=1)

# Matrix X
X = np.matrix(X)

# Transpose X
XT = np.matrix.transpose(X)

# Matrix y
y = y.reshape((-1, 1))
y = np.matrix(y)

# Multiplication
XT_X = np.matmul(XT, X)
XT_y = np.matmul(XT, y)

# Betas
Betas = np.matmul(np.linalg.inv(XT_X), XT_y)
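(Not part of the original question, but a useful sanity check on the normal-equation result above: `np.linalg.lstsq` solves the same least-squares problem without forming the inverse explicitly, and should agree with `Betas`.)

```python
import numpy as np

x = np.array([20, 25, 28, 29, 30])
y = np.array([20000, 25000, 26000, 30000, 32000])

# Design matrix: a column of ones (intercept) next to x
X = np.column_stack([np.ones(len(x)), x])

# Normal equations: Betas = (X^T X)^-1 X^T y
betas_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Dedicated least-squares solver; numerically more stable
betas_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(betas_normal)  # [intercept, slope]
```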

Now I need to obtain my predictions to plot and see how well the model fits:

import seaborn as sns
X = x
sns.regplot(X, prediction, fit_reg=False)
sns.regplot(X, y, fit_reg=False)

How can I calculate that prediction, i.e. obtain the fitted y values from the X values?

Thanks!


Solution

  • It is X @ Betas:

    import matplotlib.pyplot as plt

    predictions = np.array(X @ Betas).ravel()
    x_values = np.array(X[:, 1]).ravel()  # disregard the 1s column, which is the bias term
    y_values = np.array(y).ravel()

    sns.regplot(x=x_values, y=predictions, fit_reg=False, label="preds")
    sns.regplot(x=x_values, y=y_values, fit_reg=False, label="truth")
    plt.legend()
    

    I'm not sure why you opted for np.matrix; it is deprecated, and plain np.array works fine here (use @ for matrix multiplication).
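Following that suggestion, the whole fit can be written with plain ndarrays. This is a sketch in which `np.linalg.solve` is swapped in for the explicit inverse (solving `X.T @ X @ b = X.T @ y` directly is more stable than inverting):

```python
import numpy as np

x = np.array([20, 25, 28, 29, 30])
y = np.array([20000, 25000, 26000, 30000, 32000])

# Design matrix: column of ones (intercept) next to x
X = np.column_stack([np.ones_like(x, dtype=float), x])

# Solve the normal equations X^T X b = X^T y without forming an inverse
betas = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values are just the matrix-vector product
predictions = X @ betas
```

With an intercept in the model, the residuals sum to zero, so `predictions.mean()` matches `y.mean()`.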