May 1, 2023 · Open Access

Determining the optimal number of folds to use in a K-fold cross-validation: A neural network classification experiment

Abstract

A large dataset is needed both to provide a learning set large enough to train a suitable classifier and to provide a testing set large enough to give a good estimate of the classifier’s performance (i.e. its error probability). When a small dataset is randomly partitioned into learning and testing sets, both sets end up too small, which makes it difficult to obtain a suitable classifier from the learning set and a good estimate of its performance from the testing set. K-fold cross-validation has sometimes been suggested as a way to overcome this problem. The objective of this study was therefore to determine the optimal number of folds to use in a K-fold cross-validation. This was done by simulation, using an artificial two-class normal mixture dataset with a total of 1000 samples and the resilient backpropagation learning method over 10,000 training epochs, with and without early stopping during the training of the neural networks.
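The K-fold procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's setup: the mixture parameters (one-dimensional classes centred at 0 and 2 with unit variance), the values of K tried, and all function names are hypothetical, and a simple nearest-centroid rule stands in for the neural network trained with resilient backpropagation.

```python
import random
import statistics


def make_mixture(n=1000, seed=0):
    """Two-class sample; the means (0 and 2) and unit variance are assumed, not from the paper."""
    rng = random.Random(seed)
    data = [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(n)]
    rng.shuffle(data)
    return data


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_centroids(train):
    """Stand-in classifier (not a neural network): the mean of each class."""
    return {c: statistics.fmean(x for x, y in train if y == c) for c in (0, 1)}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def cv_error(data, k):
    """Pooled misclassification rate over the k held-out folds."""
    errors = 0
    for test_idx in k_fold_indices(len(data), k):
        held_out = set(test_idx)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        model = fit_centroids(train)
        errors += sum(predict(model, data[i][0]) != data[i][1] for i in test_idx)
    return errors / len(data)


if __name__ == "__main__":
    data = make_mixture()
    for k in (2, 5, 10, 20):
        print(k, cv_error(data, k))
```

Each sample is used for testing exactly once and for training K - 1 times, which is why the approach suits small datasets. Comparing the estimates across several values of K, as the loop at the end does, mirrors the kind of comparison the study performs, though with a far simpler classifier.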
