Tags: machine-learning, python, random-forest

Hyperparameter optimisation in Python with a separate validation set


I am trying to optimise the hyperparameters of a random forest regressor in Python.

I have 3 separate datasets: train/validate/test. Therefore, rather than using cross-validation, I want to tune the hyperparameters against the specific validation set, i.e. the "First Approach" described in this stackoverflow post.

Now, sklearn has some nice built-in methods for hyperparameter optimisation using cross-validation (e.g. this tutorial), but what if I want to tune my hyperparameters with a specific validation set? Is it still possible to use a method like RandomizedSearchCV?


Solution

  • It is indeed possible, via the cv option. As the documentation states, one of the accepted inputs is an iterable of train/test index tuples:

    An iterable yielding (train, test) splits as arrays of indices.

    So a list of size one, holding the train and validation indices packed as a tuple, will do the job.
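
    A minimal sketch of that approach (X_train, y_train, X_valid, y_valid are assumed placeholder names for your own train and validation sets; the random arrays below just stand in so the snippet runs):

        import numpy as np
        from scipy.stats import randint
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import RandomizedSearchCV

        rng = np.random.default_rng(0)
        X_train, y_train = rng.random((100, 5)), rng.random(100)  # stand-in data;
        X_valid, y_valid = rng.random((30, 5)), rng.random(30)    # use your real split

        # Stack train and validation so the search receives one array,
        # then describe the single split by position.
        X = np.concatenate([X_train, X_valid])
        y = np.concatenate([y_train, y_valid])
        train_idx = np.arange(len(X_train))
        valid_idx = np.arange(len(X_train), len(X))

        # A one-element list of (train, validation) index arrays --
        # exactly the "iterable yielding (train, test) splits" from the docs.
        cv = [(train_idx, valid_idx)]

        search = RandomizedSearchCV(
            RandomForestRegressor(random_state=0),
            param_distributions={
                "n_estimators": randint(50, 500),
                "max_depth": randint(2, 20),
            },
            n_iter=20,
            cv=cv,
            random_state=0,
        )
        search.fit(X, y)
        print(search.best_params_)

    Two things to keep in mind: with the default refit=True, best_estimator_ is refit on everything passed to fit (train plus validation combined), so pass refit=False if you want to refit on the training set alone before touching the test set; and sklearn also ships PredefinedSplit, which builds the same single split from a test_fold array if you would rather not manage the indices by hand.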