This code computes a linear regression with a function, "standRegres", that we wrote ourselves. We could fit the model with the functions in sklearn or statsmodels, but here we want to build the function by hand. Unfortunately, I ran into an error that I can't resolve, so I'm asking for your help.
The whole code runs without any problem until the last line. Running the last line raises an error: "ValueError: ndarray is not contiguous".
import os
import pandas as pd
import numpy as np
import pylab as pl
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
# load data
iris = load_iris()
# Define a DataFrame
df = pd.DataFrame(iris.data, columns = iris.feature_names)
# take a look
df.head()
#len(df)
# rename the column name
df.columns = ['sepal_length','sepal_width','petal_length','petal_width']
X = df[['petal_length']]
y = df['petal_width']
from numpy import *
#########################
# Define function to do matrix calculation
def standRegres(xArr,yArr):
    xMat = mat(xArr); yMat = mat(yArr).T
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0.0:
        print ("this matrix is singular, cannot do inverse!")
        return NA
    else :
        ws = xTx.I * (xMat.T * yMat)
        return ws
# test
x0 = np.ones((150,1))
x0 = pd.DataFrame(x0)
X0 = pd.concat([x0,X],axis = 1)
# test
standRegres(X0,y)
I tried to solve it but don't know how. Could you help me? I'd really appreciate it!
Your problem stems from using the `mat` function. Stick to `array`.
To use `array`, you'll need the `@` operator for matrix multiplication, not `*`, which multiplies arrays element by element. Finally, you have a line that uses `xTx.I`, but that attribute is only defined on matrix objects, not on general arrays, so we can use `numpy.linalg.inv` instead.
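As a quick illustration of the points above, here is a minimal sketch on two small made-up arrays (`A` and `B` are illustrative values, not part of your data). It shows what `*`, `@`, and `numpy.linalg.inv` each do on plain arrays:

```python
import numpy as np

# Two small example arrays (made-up values, for illustration only).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

# On plain arrays, * multiplies matching entries.
elementwise = A * B        # [[0, 2], [3, 0]]

# @ is the row-by-column matrix product.
matrix_product = A @ B     # [[2, 1], [4, 3]]

# Plain arrays have no .I attribute; np.linalg.inv does the same job.
A_inv = np.linalg.inv(A)
```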
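def standRegres(xArr,yArr):
    # Convert the inputs (DataFrame / Series) to plain arrays
    xMat = np.array(xArr); yMat = np.array(yArr)
    xTx = xMat.T @ xMat
    if np.linalg.det(xTx) == 0.0:
        print ("this matrix is singular, cannot do inverse!")
        return None  # NA is not a defined name in Python
    else :
        # Normal equation: ws = (X^T X)^-1 X^T y
        ws = np.linalg.inv(xTx) @ (xMat.T @ yMat)
        return ws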
# test
x0 = np.ones((150,1))
x0 = pd.DataFrame(x0)
X0 = pd.concat([x0,X],axis = 1)
# test
standRegres(X0,y)
# Output: array([-0.36651405, 0.41641913])
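If you want to sanity-check the result without loading iris, here is a minimal, self-contained sketch on synthetic data. The name `stand_regres` and the generated data are hypothetical. It also makes two substitutions you may or may not want: `np.linalg.solve` replaces the explicit inverse, and a rank check replaces the exact `det(...) == 0.0` comparison, which can miss nearly singular matrices in floating point. The result is compared against `np.linalg.lstsq`.

```python
import numpy as np

def stand_regres(x_arr, y_arr):
    """Hypothetical variant: least squares via the normal equations."""
    x = np.asarray(x_arr, dtype=float)   # also accepts a DataFrame
    y = np.asarray(y_arr, dtype=float)   # also accepts a Series
    xtx = x.T @ x
    # Rank check instead of det == 0.0 (an assumption, not the original test).
    if np.linalg.matrix_rank(xtx) < xtx.shape[0]:
        print("this matrix is singular, cannot do inverse!")
        return None
    # Solve (X^T X) ws = X^T y without forming the inverse explicitly.
    return np.linalg.solve(xtx, x.T @ y)

# Synthetic data: y is roughly 0.4 * x - 0.35 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 7.0, size=50)
y = 0.4 * x - 0.35 + rng.normal(scale=0.05, size=50)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

ws = stand_regres(X, y)
ws_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(ws, ws_lstsq)  # the two coefficient vectors should agree closely
```

The first entry of `ws` is the intercept and the second is the slope, in the same order as the columns of `X`.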