python pandas scikit-learn sklearn-pandas

sklearn Linear Regression coefficients have single value output


I am using a dataset to examine the relationship between salary and college GPA with sklearn's linear regression model. I expected the coefficients to include both the intercept and the coefficient of the corresponding feature, but the model returns only a single value.

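# train_test_split lives in sklearn.model_selection (sklearn.cross_validation was removed in scikit-learn 0.20)
from sklearn.model_selection import train_test_split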
from sklearn.linear_model import LinearRegression

# Use only one feature: collegeGPA
labour_data_gpa = labour_data[['collegeGPA']]

# Salary as the dependent variable
labour_data_salary = labour_data[['Salary']]

# Split the data into training/testing sets
gpa_train, gpa_test, salary_train, salary_test = train_test_split(labour_data_gpa, labour_data_salary)

# Create linear regression object
regression = LinearRegression()

# Train the model using the training sets (the first argument is X, the features)
regression.fit(gpa_train, salary_train)

# Coefficients
regression.coef_

The output is:

Out[12]: array([[ 3235.66359637]])

Solution

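  • coef_ holds only the slope of each feature; the intercept is stored separately in intercept_. fit_intercept=True is already the default, so passing it explicitly is optional. Try: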

    regression = LinearRegression(fit_intercept=True)
    regression.fit(gpa_train, salary_train)
    

    and the slope and the intercept will then be in

    regression.coef_
    regression.intercept_
    

    To get a better understanding of your linear regression, you may want to consider another module, statsmodels; the following tutorial helps: http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/ols.html
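
As a minimal sketch of the point above, the snippet below fits LinearRegression on a small synthetic dataset and reads the slope and the intercept from their separate attributes. The GPA and salary numbers are made up for illustration (they stand in for labour_data, which is not available here), and the relationship salary = 20000 + 3000 * gpa is an assumption chosen so the fit is exact. It also shows why the question's output is nested as array([[...]]): the target was passed as a two-dimensional, single-column object.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the question's data:
# salary = 20000 + 3000 * gpa, with no noise so the fit is exact.
gpa = np.array([[2.0], [2.5], [3.0], [3.5], [4.0]])  # shape (n_samples, 1)
salary = 20000.0 + 3000.0 * gpa                      # 2-D target, like labour_data[['Salary']]

regression = LinearRegression()  # fit_intercept=True is the default
regression.fit(gpa, salary)

# With a 2-D target, coef_ has shape (n_targets, n_features) = (1, 1)
# and intercept_ has shape (n_targets,) = (1,).
print("slope:", regression.coef_)
print("intercept:", regression.intercept_)

# With a 1-D target, coef_ is 1-D and intercept_ is a plain scalar.
regression_1d = LinearRegression().fit(gpa, salary.ravel())
print("slope (1-D target):", regression_1d.coef_)
print("intercept (1-D target):", regression_1d.intercept_)
```

If flat values are preferred, passing the target as a one-dimensional array or Series (for example labour_data['Salary'] rather than labour_data[['Salary']]) avoids the extra nesting in coef_.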