September 20, 2024 | Open Access

Optimization of Classification Algorithms Performance with k-Fold Cross Validation

Abstract

Supervised learning is a method for building models that predict or classify. Supervised learning algorithms build a model from training data that includes both independent and dependent variables. Methods for building classification models include Logistic Regression, Naive Bayes, K-Nearest Neighbor (KNN), decision trees, and others. A classification algorithm's failure to generalize to new data is typically associated with overfitting or underfitting. K-fold cross-validation is a method that can help avoid overfitting or underfitting and produce an algorithm that performs well on new data. This study tests the Naive Bayes, K-Nearest Neighbor (KNN), Classification and Regression Tree (CART), and Logistic Regression methods with k-fold cross-validation on two different datasets. The values of k used for cross-validation are 2, 3, 5, 7, and 10. The analysis shows that each classification algorithm performed best with 10-fold cross-validation. On DATA 1, Naive Bayes achieved the highest average accuracy, 0.67 (67%), with an error rate of 0.33 (33%), followed by CART, KNN, and finally logistic regression. On DATA 2, KNN achieved the highest average accuracy, 0.66 (66%), with an error rate of 0.34 (34%), followed by CART, Naive Bayes, and finally logistic regression. These results can nonetheless serve as a reference for predicting the growth direction of the accommodation and food service activities sector.
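The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy dataset, the helper names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`, `make_toy_data`), the choice of 3 neighbors, and the random seeds are all assumptions made for illustration, and only KNN is shown. It also assumes that "average accuracy" means the mean accuracy over the k folds and that the error rate is 1 minus that accuracy.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Return the majority label among the n_neighbors closest training points."""
    by_distance = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in by_distance[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, seed=0):
    """Mean accuracy over k folds; each fold is used once as the test set."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_X, train_y, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def make_toy_data(n=100, seed=1):
    """Hypothetical two-class data: two overlapping Gaussian clusters in 2-D."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 1.5
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    # Same k values as in the study: 2, 3, 5, 7, and 10.
    for k in (2, 3, 5, 7, 10):
        acc = cross_val_accuracy(X, y, k)
        print(f"k={k:2d}  accuracy={acc:.2f}  error rate={1 - acc:.2f}")
```

Running the script prints one accuracy and error-rate pair per value of k. Other classifiers (Naive Bayes, CART, logistic regression) could be compared the same way by swapping out `knn_predict`.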

Authors

Moch. Anjas Aprihartha

Idham Idham

Journals

EIGEN MATHEMATICS JOURNAL
