python machine-learning scikit-learn logistic-regression lasso-regression

Using logistic regression coefficients to write a prediction function in Python


I just fitted a logistic regression model. The data can be downloaded from the link below: please click this link to download the data

Below is the code for the logistic regression.

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score
import pandas as pd
scaler = StandardScaler()

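data = pd.read_csv('data.csv')
dataX = data.drop('outcome', axis=1).values.astype(float)
# The model is trained on standardized features, so lr.coef_ applies to scaled inputs
X = scaler.fit_transform(dataX)
# Use a 1-D target array; a column vector triggers a DataConversionWarning in fit()
Y = data['outcome'].values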

X_train,X_test,y_train,y_test = train_test_split (X,Y,test_size = 0.25, random_state = 33)

lr = LogisticRegression()
lr.fit(X_train,y_train)

# Predict the probability of the testing samples to belong to 0 or 1 class
predicted_probs = lr.predict_proba(X_test)
print(predicted_probs[0:3])
print(lr.coef_)

I can print the coefficients of the logistic regression, and I can compute the probability of the outcome being 1 or 0.

When I write a Python function using those coefficients to compute the probability of outcome 1, I do not get the same answer as lr.predict_proba(X_test).

The function I wrote is as follows:

def xG(bodyPart,shotQuality,defPressure,numDefPlayers,numAttPlayers,shotdist,angle,chanceRating,type):
    coeff = [0.09786083, 2.30523761, -0.05875112, 0.07905136,
             -0.1663424, -0.73930942, -0.10385882, 0.98845481, 0.13175622]

    return (coeff[0]*bodyPart + coeff[1]*shotQuality + coeff[2]*defPressure + coeff[3]*numDefPlayers + coeff[4]*numAttPlayers + coeff[5]*shotdist + coeff[6]*angle + coeff[7]*chanceRating + coeff[8]*type)

I get a strange answer, so I know something is wrong in the function's calculation.

May I ask for your advice? I am new to machine learning and statistics.


Solution

  • I think you missed the intercept_ in your xG. You can retrieve it from lr.intercept_ and add it to the weighted sum, which then has to be passed through the logistic (sigmoid) function to give a probability (math.exp needs import math):

    return 1/(1+math.exp(-(intercept + coeff[0]*bodyPart+ coeff[1]*shotQuality+coeff[2]*defPressure+coeff[3]*numDefPlayers+coeff[4]*numAttPlayers+coeff[5]*shotdist+ coeff[6]*angle+coeff[7]*chanceRating+coeff[8]*type)))
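
One more thing that may matter here: the model in the question was fitted on the output of StandardScaler, so the coefficients apply to standardized values, not to the raw inputs. If xG is called with raw feature values, they would need the same (x - mean) / scale transform first. Below is a minimal sketch of the whole calculation in plain Python. The function names sigmoid and manual_proba are hypothetical, and the numbers are made up for illustration; in practice I would assume coef = lr.coef_[0], intercept = lr.intercept_[0], mean = scaler.mean_ and scale = scaler.scale_.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    # Branch on the sign of z so math.exp never receives a large positive argument
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def manual_proba(raw_features, coef, intercept, mean, scale):
    """Probability of class 1 for one sample of raw (unscaled) feature values.

    coef, intercept: assumed to come from lr.coef_[0] and lr.intercept_[0]
    mean, scale:     assumed to come from scaler.mean_ and scaler.scale_
    """
    z = intercept
    for x, w, m, s in zip(raw_features, coef, mean, scale):
        # Standardize the raw value the same way StandardScaler does, then weight it
        z += w * (x - m) / s
    return sigmoid(z)

# Made-up values for illustration only (not taken from the fitted model)
coef = [2.0, -1.0]
intercept = 0.5
mean = [10.0, 3.0]
scale = [2.0, 1.5]

p = manual_proba([12.0, 3.0], coef, intercept, mean, scale)
print(round(p, 4))
```

With the real fitted values plugged in, the result for a test row should be comparable to the second column of lr.predict_proba for that row, since that column holds the probability of class 1 when the classes are 0 and 1.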