
K-fold cross-validation
You've already seen a form of cross-validation before; holding out a portion of our data is the simplest form of cross-validation that we can have. While this is generally a good practice, a single split can sometimes leave important examples or patterns out of the training set, which can lead to poor performance when it comes time to test. To remedy this, we can take standard cross-validation a step further with a technique called k-fold cross-validation.
In k-fold cross-validation, our dataset is divided into k equal parts, called folds, where k is chosen by the user. As a rule of thumb, you should generally stick to k = 5 or k = 10. The model is then trained and tested k times over. During each training episode, one of the k segments of the data is held out as a testing set and the other segments are used for training. You can think of this like working through a shuffled deck of cards: each time, we take a different card out for testing and leave the rest for training, until every card has been used for testing exactly once. The total accuracy of the model and its error are then a combination, typically the average, of the results from all of the train/test episodes that were conducted.
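To make the mechanics concrete, here is a minimal sketch of the procedure in plain Python. The helper names (k_fold_indices, cross_validate), the toy threshold model, and the made-up data are all assumptions chosen for illustration; in practice you would most likely rely on a library implementation rather than write this by hand.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds, start = [], 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def cross_validate(xs, ys, k, fit, predict):
    """Train and test k times, holding out one fold per episode.

    Returns the list of per-fold accuracies.
    """
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        test_idx = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in test_idx]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct = sum(predict(model, xs[i]) == ys[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Hypothetical stand-in model: a threshold halfway between the two class means.
def fit_threshold(xs, ys):
    mean0 = mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return int(x > threshold)


# Made-up one-dimensional data: class 0 below 25, class 1 from 25 upward.
xs = list(range(50))
ys = [0 if x < 25 else 1 for x in xs]

scores = cross_validate(xs, ys, k=5, fit=fit_threshold, predict=predict_threshold)
print("per-fold accuracy:", scores)
print("mean accuracy:", mean(scores))
```

Each sample lands in the testing set exactly once, and the final figure reported is the mean of the k per-fold scores. Any model can be swapped in by passing a different pair of fit and predict functions.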
There are some models, such as Logistic Regression and Support Vector Machines, which benefit from k-fold cross-validation. Neural network models, such as the ones that we will be discussing in the coming chapter, also benefit from k-fold cross-validation methods. Random Forest models like the ones we described earlier, on the other hand, do not strictly require k-fold cross-validation. K-fold is often used as a tuning and optimization method, and Random Forests already contain a measure of feature importance, along with an out-of-bag error estimate computed from the samples that each tree did not see during training.