
Python 3: "ndarray is not contiguous" error when constructing a regression function


This code computes a linear regression with a hand-written function, "standRegres". The model could be fitted with the functions in sklearn or statsmodels, but here I want to build the function myself. Unfortunately, I ran into an error that I can't resolve, so I'm asking for your help.

The whole code runs without any problem until the last line. When I run the last line, an error message appears: "ValueError: ndarray is not contiguous".

import os

import pandas as pd
import numpy as np
import pylab as pl
import matplotlib.pyplot as plt

from sklearn.datasets import load_iris
# load data
iris = load_iris()
# Define a DataFrame
df = pd.DataFrame(iris.data, columns = iris.feature_names)
# take a look
df.head()
#len(df)


# rename the column name 
df.columns = ['sepal_length','sepal_width','petal_length','petal_width']


X = df[['petal_length']]
y = df['petal_width']


from numpy import *
#########################
# Define function to do matrix calculation
def standRegres(xArr, yArr):
    xMat = mat(xArr)
    yMat = mat(yArr).T
    xTx = xMat.T * xMat
    if linalg.det(xTx) == 0.0:
        print("this matrix is singular, cannot do inverse!")
        return None  # NA is not a defined name in Python
    else:
        ws = xTx.I * (xMat.T * yMat)
        return ws

# test
x0 = np.ones((150,1))
x0 = pd.DataFrame(x0)
X0 = pd.concat([x0,X],axis  = 1)

# test
standRegres(X0,y)

I tried to solve it but don't know how. Could you help me? I'd really appreciate it!


Solution

  • Your problem stems from using the mat function, which builds a numpy matrix object. Stick to array instead.

    With arrays, use the @ operator for matrix multiplication rather than *, which multiplies arrays elementwise. Your code also uses xTx.I, but that attribute is only defined for matrix objects, not for general arrays, so use numpy.linalg.inv instead.

    def standRegres(xArr, yArr):
        # Convert the inputs (pandas objects here) to plain ndarrays
        xMat = np.array(xArr)
        yMat = np.array(yArr)
        xTx = xMat.T @ xMat
        if np.linalg.det(xTx) == 0.0:
            print("this matrix is singular, cannot do inverse!")
            return None  # NA is not a defined name in Python
        else:
            ws = np.linalg.inv(xTx) @ (xMat.T @ yMat)
            return ws
    
    # test
    x0 = np.ones((150,1))
    x0 = pd.DataFrame(x0)
    X0 = pd.concat([x0,X],axis  = 1)
    
    # test
    standRegres(X0,y)
    # Output: array([-0.36651405,  0.41641913])
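
As a cross-check on the array-based approach, here is a minimal sketch that solves the same normal equations with numpy.linalg.solve and compares the result with numpy.linalg.lstsq. The function name stand_regres and the synthetic data are assumptions made for illustration, not part of the original code. The rank test is a swapped-in alternative to comparing a floating-point determinant with 0.0.

```python
import numpy as np

def stand_regres(x_arr, y_arr):
    """Ordinary least squares via the normal equations, using plain ndarrays."""
    # np.asarray accepts lists, ndarrays, and pandas objects alike.
    x = np.asarray(x_arr, dtype=float)
    y = np.asarray(y_arr, dtype=float)
    xtx = x.T @ x
    # Rank check instead of testing det(xtx) == 0.0 (an assumption of this sketch).
    if np.linalg.matrix_rank(xtx) < xtx.shape[0]:
        print("this matrix is singular, cannot do inverse!")
        return None
    # Solve (X^T X) w = X^T y directly instead of forming the inverse.
    return np.linalg.solve(xtx, x.T @ y)

# Synthetic data, hypothetical and chosen so the answer is known: y = 2 + 3*x.
x = np.arange(10, dtype=float)
X0 = np.column_stack([np.ones_like(x), x])  # prepend the intercept column
y = 2.0 + 3.0 * x

ws = stand_regres(X0, y)

# Reference solution from numpy's least-squares solver.
ws_lstsq, *_ = np.linalg.lstsq(X0, y, rcond=None)
print(ws, ws_lstsq)
```

With the iris data from the question, the same call would be stand_regres(X0, y), since np.asarray converts the DataFrame and Series to arrays before any matrix arithmetic.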