I've been reading quite a bit and I'm a little confused about k-folds. I understand the concept behind it, but I'm not sure how to deploy it.
The usual steps I've been seeing after data exploration are: train_test_split, then encoding and scaling with fit_transform on the training set (and just transform on the test set), before testing which algorithms work. After that, they tune the hyper-parameters.
So if I were to use k-folds now, do I avoid using train_test_split? And at which point do we use k-folds?
Thanks!
No. K-fold splits your data into train/test splits K times, so you train K different models.
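As a minimal sketch of that loop (assuming scikit-learn; the iris dataset, StandardScaler and LogisticRegression below are just placeholders for your own data and estimator), KFold hands you K different train/test index splits, and you refit your scaler/encoder and model inside each fold:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # placeholder dataset

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]

    # fit_transform on the training fold only, transform on the test fold,
    # so the held-out fold never leaks into the preprocessing
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    print(f"fold {fold}: {model.score(X_test, y_test):.3f}")
```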
This approach makes your model results more robust because you train K different models on different parts of your dataset, and you also predict on different parts of your data K times. Finally, you can simply take the average score of the K models.
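If you're on scikit-learn, cross_val_score with a Pipeline does the same thing in one call (again, the dataset and estimator here are only examples); the Pipeline re-fits the scaler inside every training fold for you, and you average the K fold scores at the end:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # placeholder dataset

pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

scores = cross_val_score(pipe, X, y, cv=5)  # K = 5 folds
print(scores)         # one score per fold
print(scores.mean())  # the average score across the K models
```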