python-3.x machine-learning scikit-learn linear-regression

Machine Learning Linear Regression - Sklearn


I'm new to the machine learning domain and have some doubts about linear regression.

1: While practicing with the predict method of the sklearn linear regression model, I get the error below.

Code:

sklearn.linear_model.LinearRegression.predict(25)

Error: "ValueError: Expected 2D array, got scalar array instead: array=25. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample."

Do I need to pass a 2-D array? I checked the sklearn documentation page and haven't found anything about a version update. I'm running my code on Kaggle:
https://www.kaggle.com/aman9d/bikesharingdemand-upx/

2: Is the index of the dataset going to affect the model's score (weights)?


Solution

  • First of all, you should post your code the way you actually run it:

    # import, instantiate, fit
    from sklearn.linear_model import LinearRegression
    linreg = LinearRegression()
    linreg.fit(X, y)
    
    # use the predict method (passing a scalar like this raises the ValueError from the question)
    linreg.predict(25)
    

    What you posted in the question is not executable as written: predict is not a static method of the LinearRegression class, so it has to be called on a fitted instance.

    When you fit a model, the first step is to recognize what kind of data the input will be. In your case it is X, which means that if you pass the model something with a different shape from X, it will raise an error.

    In your example X seems to be a pd.DataFrame instance with only 1 column. It can be replaced by a 2-dimensional array of shape (number of examples, number of features), so if you try:

    linreg.predict([[25]])
    

    it should work.

    For example, if you were fitting a regression with more than 1 feature (i.e. column), say temp and humidity, your input would look like this:

    linreg.predict([[25, 56]])
    

    I hope this helps. Always keep the shape of your data in mind.

    Documentation: LinearRegression fit

    X : array-like or sparse matrix, shape (n_samples, n_features)
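
To make the shape rule concrete, here is a minimal sketch that can be run as-is. The toy data is made up for illustration (it is not the Kaggle bike sharing dataset from the question), and the names X1, y1, X2, y2 are hypothetical. It shows that a scalar is rejected, while a 2D input of shape (n_samples, n_features) works for both the one-feature and the two-feature case.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data with ONE feature: y = 2 * x + 1
X1 = np.array([[10.0], [20.0], [30.0], [40.0]])  # shape (4, 1)
y1 = 2.0 * X1.ravel() + 1.0

linreg = LinearRegression()
linreg.fit(X1, y1)

# Passing a bare scalar is rejected because predict expects a 2D input
try:
    linreg.predict(25)
    scalar_rejected = False
except ValueError:
    scalar_rejected = True

# One sample, one feature: nested list or reshape(1, -1)
single = linreg.predict([[25]])
reshaped = linreg.predict(np.array(25).reshape(1, -1))

# Several samples, one feature each: reshape(-1, 1)
batch = linreg.predict(np.array([25, 30]).reshape(-1, 1))

# Hypothetical toy data with TWO features (temp, humidity):
# y = 3 * temp - 0.5 * humidity + 2
X2 = np.array([[10.0, 50.0], [20.0, 60.0], [30.0, 55.0], [40.0, 70.0]])
y2 = 3.0 * X2[:, 0] - 0.5 * X2[:, 1] + 2.0

linreg2 = LinearRegression()
linreg2.fit(X2, y2)

# One sample with two features: shape (1, 2)
two_features = linreg2.predict([[25, 56]])

print(scalar_rejected, single, reshaped, batch, two_features)
```

Note that reshape(1, -1) means "one sample, as many features as there are values", while reshape(-1, 1) means "one feature, as many samples as there are values", which matches the hint in the error message.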