Search code examples
python-3.xscikit-learnlinear-regression

ValueError: Expected 2D array, got 1D array instead:


While practicing Simple Linear Regression Model I got this error, I think there is something wrong with my data set.

Here is my data set:

Here is independent variable X:

Here is dependent variable Y:

Here is X_train

Here Is Y_train

This is error body:

ValueError: Expected 2D array, got 1D array instead:
array=[ 7.   8.4 10.1  6.5  6.9  7.9  5.8  7.4  9.3 10.3  7.3  8.1].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

And this is My code:

import pandas as pd
import matplotlib as pt

#import data set

dataset = pd.read_csv('Sample-data-sets-for-linear-regression1.csv')
x = dataset.iloc[:, 1].values
y = dataset.iloc[:, 2].values

#Spliting the dataset into Training set and Test Set
from sklearn.cross_validation import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size= 0.2, random_state=0)

#linnear Regression

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()
regressor.fit(x_train,y_train)

y_pred = regressor.predict(x_test)

Thank you


Solution

  • You need to give both the fit and predict methods 2D arrays. Your x_train and x_test are currently only 1 dimensional. What is suggested by the console should work:

    x_train= x_train.reshape(-1, 1)
    x_test = x_test.reshape(-1, 1)
    

    This uses numpy's reshape to transform your array. For example, x = [1, 2, 3] wopuld be transformed to a matrix x' = [[1], [2], [3]] (-1 gives the x dimension of the matrix, inferred from the length of the array and remaining dimensions, 1 is the y dimension - giving us a n x 1 matrix where n is the input length).

    Questions about reshape have been answered in the past, this for example should answer what reshape(-1,1) fully means: What does -1 mean in numpy reshape? (also some of the other below answers explain this very well too)