python pandas dataframe scikit-learn prediction

AttributeError: 'NoneType' object has no attribute 'drop' when merging two functions


I have a dataframe for which I predicted the result using XGBoost (all the necessary imports are in place, so I omit them here):

studentId       testId    result       Length     Words      picture     
s1              t1        0            10         8.50       0            
s1              t2        0            11         9.80       1           
s1              t3        1            11        10.40       1           
s2              t2        0            11         9.80       1           
s2              t4        1            60         9.99       0           
s3              t7        1            40         6.45       0            




cols_to_drop = ['testId', 'studentId']
df.drop(cols_to_drop, axis=1, inplace=True)
X = df.drop('result', axis=1) 
y = df['result']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)
model = XGBClassifier() 
model.fit(X_train, y_train) 
y_pred = model.predict(X_test) 

I have a part of this dataframe for which I can also predict the result in a different way using surprise, not using all the above features:

studentId       testId            result
s1              t1                0
s1              t2                0
s1              t3                1
s2              t2                0
s2              t4                1
s3              t7                1

reader = Reader(rating_scale=(0, 1))
data = Dataset.load_from_df(df_small[['studentId', 'testId', 'result']], reader)
trainset, testset = train_test_split(data, test_size=0.25)
algo = KNNWithMeans()
algo.fit(trainset)
test = algo.test(testset)
test = pd.DataFrame(test)
test.drop("details", inplace=True, axis=1)
test.columns = ['userId', 'questionId', 'actual', 'cf_predictions']

Now, I want to create a model that combines the two and assigns different weights to each model. I tried to write the things above as functions and then everything as a big function:

def model_1(df):
    cols_to_drop = ['testId', 'studentId']
    new_df=df.drop(cols_to_drop, axis=1, inplace=True)
    X = new_df.drop('result', axis=1) 
    y = new_df['result']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)
    model = XGBClassifier() 
    model.fit(X_train, y_train) 
    y_pred = model.predict(X_test)
    return y_test, y_pred

def model_2(df):
    reader = Reader(rating_scale=(0, 1))
    data = Dataset.load_from_df(df[['studentId', 'testId', 'result']], reader)
    trainset, testset = train_test_split(data, test_size=0.25)
    algo = KNNWithMeans()
    algo.fit(trainset)
    test = algo.test(testset)
    test = pd.DataFrame(test)
    test.drop("details", inplace=True, axis=1)
    test.columns = ['studentId', 'testId', 'actual', 'cf_predictions']
    return test

def merged_models(df):
    first_model = model_1(df)
    second_model = model_2(df)

    prediction = 0.5 * first_model + 0.5 * second_model # weights example
    return prediction

The first two work, but merged_models(df) doesn't even get to apply model_1: it fails with AttributeError: 'NoneType' object has no attribute 'drop' at X = new_df.drop('result', axis=1). The code is probably a mess, but is there any way to combine two such different models and also evaluate this "hybrid"?


Solution

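  • df.drop returns None when inplace is set to True: it modifies the DataFrame in place instead of returning a new one, so new_df in model_1 ends up as None. Either call df.drop(..., inplace=True) without assigning the result, or leave out inplace=True and keep the assignment. You don't need a new name for the result in the first case.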
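
A minimal sketch of the point above, using only pandas and the sample rows from the question. The helper name split_features is hypothetical (it is not part of the original code); it stands in for the first few lines of model_1 and drops the columns without inplace=True, so it returns a new DataFrame rather than None.

```python
import pandas as pd

# Toy frame mirroring the question's columns (values copied from the post).
df = pd.DataFrame({
    "studentId": ["s1", "s1", "s1", "s2", "s2", "s3"],
    "testId": ["t1", "t2", "t3", "t2", "t4", "t7"],
    "result": [0, 0, 1, 0, 1, 1],
    "Length": [10, 11, 11, 11, 60, 40],
    "Words": [8.50, 9.80, 10.40, 9.80, 9.99, 6.45],
    "picture": [0, 1, 1, 1, 0, 0],
})

cols_to_drop = ["testId", "studentId"]

# With inplace=True the call mutates its target and returns None,
# which is why new_df was None inside model_1.
# (A copy is used here so the original df stays intact.)
returned = df.copy().drop(cols_to_drop, axis=1, inplace=True)
print(returned)  # None


def split_features(frame):
    """Hypothetical helper: return (X, y) without mutating `frame`."""
    features = frame.drop(columns=cols_to_drop)  # returns a new DataFrame
    X = features.drop(columns="result")
    y = features["result"]
    return X, y


X, y = split_features(df)
print(list(X.columns))            # ['Length', 'Words', 'picture']
print("studentId" in df.columns)  # True: df itself is unchanged
```

Dropping without inplace=True also matters for merged_models: an in-place drop inside model_1 would remove studentId and testId from the caller's df, and model_2 needs those columns right afterwards.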