I'm using GridSearchCV for tuning hyperparameters, and now I want to normalize the features (StandardScaler()) in the training and validation steps. But I think I cannot do this.
The question is:
GridSearchCV for tuning the parameters

Indeed, this would cause a data leak; it's very good that you caught it!
One solution is to use a pipeline: make StandardScaler the first step, follow it with your classifier of choice, and pass the whole pipeline to GridSearchCV. The scaler is then fitted only on the training folds of each cross-validation split, so no statistics from the validation fold leak into training.
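from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
# MyClassifier and param_grid are placeholders for your own estimator and search grid
clf = make_pipeline(StandardScaler(), MyClassifier())
grid_search = GridSearchCV(clf, param_grid=param_grid, refit=True)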
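To make the snippet above concrete, here is a minimal runnable sketch. The iris dataset, the SVC classifier, and the grid values are my own assumptions for illustration and are not part of the original question. For min-max scaling instead of standardization, MinMaxScaler from sklearn.preprocessing could be swapped in for StandardScaler.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative data only: any feature matrix X and label vector y would do.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaler first, classifier second. SVC stands in for "your classifier of choice".
pipe = make_pipeline(StandardScaler(), SVC())

# make_pipeline names each step after its lowercased class name, so the
# SVC parameters are addressed as "svc__<parameter>". Grid values are arbitrary.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.1]}

# In every CV split the whole pipeline is cloned and fitted on the training
# folds only, so the scaler never sees the validation fold.
grid_search = GridSearchCV(pipe, param_grid=param_grid, cv=5, refit=True)
grid_search.fit(X_train, y_train)

print(grid_search.best_params_)
# With refit=True, the best pipeline (scaler included) is refitted on all of
# X_train and used here; X_test is scaled with the training statistics.
print(grid_search.score(X_test, y_test))
```

Because the scaler lives inside the estimator being searched, you never call fit_transform on the full dataset yourself, which is where the leak would normally come from.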
For more info, check this article here