Tags: python, classification, random-forest, gridsearchcv, make-scorer

GridSearchCV with a custom profit function results in an error: missing 1 required positional argument: 'y'


I am trying to optimize my model with GridSearchCV, using a custom profit function. However, when I run my code, I get the following error message: TypeError: profit_scorer() missing 1 required positional argument: 'y'

Here is my code:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import GridSearchCV

sale_revenue = 11
call_cost = 3
false_positive_cost = call_cost

def calculate_profit(y_true, y_pred):
    # For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    profit_from_sales = tp * (sale_revenue - call_cost)
    loss_from_wasted_calls = fp * false_positive_cost
    total_profit = profit_from_sales - loss_from_wasted_calls
    return total_profit

def profit_scorer(estimator, X, y):
    y_pred = estimator.predict(X)
    return calculate_profit(y, y_pred)

scorer = make_scorer(profit_scorer, greater_is_better=True)

rf = RandomForestClassifier(random_state=42, class_weight="balanced")
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 4, 5],
    'criterion': ['gini', 'entropy'],
    'max_features': ['sqrt', 'log2'],
    'bootstrap': [True, False]}
grid_search = GridSearchCV(
    estimator=rf,
    param_grid=param_grid,
    cv=5,
    scoring=scorer,
    verbose=1)
grid_search.fit(X, y)

Solution

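  • According to the docs, the function passed to make_scorer() should have the signature score_func(y_true, y_pred, **kwargs), so profit_scorer() should be declared as profit_scorer(y_true, y_pred, **kwargs).

    make_scorer() wraps it into a scorer with the signature scorer(estimator, X, y_true, **kwargs). That scorer calls estimator.predict(X) itself and then calls your function with only two positional arguments, y_true and y_pred. Because profit_scorer(estimator, X, y) expects three, the third one is never supplied, hence "missing 1 required positional argument: 'y'".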

    Changing the profit_scorer() declaration to the following should suffice:

    def profit_scorer(y_true, y_pred):
        return calculate_profit(y_true, y_pred)
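To make the fix concrete, here is a minimal end-to-end sketch rather than a definitive implementation. It is not the asker's actual setup: the data comes from make_classification, the parameter grid is shrunk to keep the run short, and labels=[0, 1] is an added safeguard so the confusion matrix stays 2x2 in every fold. It shows two routes that should give the same result: wrapping the metric with make_scorer(), or passing a callable with the (estimator, X, y) signature directly as scoring, without make_scorer().

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, make_scorer
from sklearn.model_selection import GridSearchCV

SALE_REVENUE = 11
CALL_COST = 3


def calculate_profit(y_true, y_pred):
    # labels=[0, 1] (an assumption for this sketch) keeps the matrix 2x2
    # even if a fold happens to contain predictions of a single class.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tp * (SALE_REVENUE - CALL_COST) - fp * CALL_COST


# Route 1: a metric with signature (y_true, y_pred), wrapped by make_scorer.
metric_scorer = make_scorer(calculate_profit, greater_is_better=True)


# Route 2: a callable with the scorer signature (estimator, X, y),
# passed to GridSearchCV directly, NOT through make_scorer.
def profit_scorer(estimator, X, y):
    return calculate_profit(y, estimator.predict(X))


# Synthetic stand-in data; replace with the real X and y.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

rf = RandomForestClassifier(random_state=42, class_weight="balanced")
param_grid = {"n_estimators": [10, 20], "max_depth": [3, 4]}  # reduced grid

results = {}
for name, scoring in [("make_scorer", metric_scorer), ("direct", profit_scorer)]:
    search = GridSearchCV(estimator=rf, param_grid=param_grid, cv=3, scoring=scoring)
    search.fit(X, y)
    results[name] = search.best_score_

print(results)
```

Either route avoids the TypeError. The error in the question arises only when a function with the (estimator, X, y) signature is handed to make_scorer(), which expects (y_true, y_pred).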