I am implementing linear regression in Python using sklearn.
I have successfully trained a model using linear_model.LinearRegression().
Now I want to measure the goodness of fit of the model using the ROC AUC method. I am using the following code:
train_set[predictors1], train_set["loan_status"] = make_classification(n_samples=4000, n_features=2, n_redundant=0, flip_y=0.4)
train, test, train_t, test_t = train_test_split(train_set[predictors1], train_set["loan_status"], train_size=0.9)
rf.fit(train, train_t)
But I am getting the following error on line 1:
ValueError: Must have equal len keys and value when setting with an ndarray
The documentation for make_classification says the following:
Returns:
X : array of shape [n_samples, n_features]. The generated samples.
y : array of shape [n_samples]. The integer labels for class membership of each sample.
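To see what those return values look like in practice, here is a minimal check using the same arguments as in the question. The variable names and the random_state argument are added here for illustration only.

```python
from sklearn.datasets import make_classification

# Same arguments as in the question; random_state is added only for reproducibility.
X, y = make_classification(n_samples=4000, n_features=2, n_redundant=0,
                           flip_y=0.4, random_state=0)

print(X.shape)        # (4000, 2): one row per sample, one column per feature
print(y.shape)        # (4000,): one label per sample
print(X[0].shape)     # (2,): X[0] is the first sample (a row)
print(X[:, 0].shape)  # (4000,): X[:, 0] is the first feature (a column)
```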
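It looks like the issue is that X is a 2-D array with two columns (one per feature), and you are attempting to assign both of its columns to one column of your pandas DataFrame. You need to isolate the column you want and then assign it to the desired DataFrame column. Indexing with _X[:, 0] selects the first feature column; _X[0] would select the first row (a single sample) instead.
# n_samples must match the number of rows in df
_X, df['loan_status'] = make_classification(n_samples=len(df), n_features=2, n_redundant=0, flip_y=0.4)
df['my_col'] = _X[:, 0]
# or
df['my_col'] = _X[:, 1]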
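Putting it together, here is a minimal end-to-end sketch. It assumes a fresh DataFrame built directly from the generated arrays, and the predictor column names (feat_1, feat_2) are hypothetical, not taken from the question. Since roc_auc_score accepts continuous scores, the predictions of LinearRegression can be passed to it directly, although a classifier such as LogisticRegression is the more usual choice for a binary target.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

predictors1 = ["feat_1", "feat_2"]  # hypothetical column names

X, y = make_classification(n_samples=4000, n_features=2, n_redundant=0,
                           flip_y=0.4, random_state=0)

# Build the frame from the arrays: one column per feature, plus the label.
train_set = pd.DataFrame(X, columns=predictors1)
train_set["loan_status"] = y

train, test, train_t, test_t = train_test_split(
    train_set[predictors1], train_set["loan_status"],
    train_size=0.9, random_state=0)

model = LinearRegression()
model.fit(train, train_t)

# The continuous predictions are used as ranking scores for the ROC curve.
auc = roc_auc_score(test_t, model.predict(test))
print(f"ROC AUC: {auc:.3f}")
```

With flip_y=0.4 a large share of the labels is assigned at random, so a modest AUC is to be expected on this synthetic data.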