In recent years, Large Language Models (LLMs) have matured to the point that their output is increasingly difficult to distinguish from text written by humans. As human and machine authorship blend more seamlessly, the need for precise and efficient methods to detect Artificial Intelligence (AI)-generated text grows. This study uses student-written papers and articles generated by various LLMs to develop a machine learning model that distinguishes whether an article was written by a student or by an LLM. Four classification models (MultinomialNB, SGDClassifier, LGBMClassifier, and CatBoostClassifier), together with ensembles that combine them using weighted combinations, were chosen to detect AI-generated texts. The models were evaluated using the Area Under the Curve (AUC) score. The results indicate that CatBoostClassifier is the most effective of the individual models at mitigating overfitting, while the ensemble model achieves the best predictive performance. These findings can help improve the accuracy of detecting AI-generated articles.
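The weighted combination and AUC evaluation described above can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the function names `roc_auc` and `weighted_ensemble`, the equal weights, and the toy probabilities are all hypothetical, and the sketch assumes that AUC refers to the area under the ROC curve and that the ensemble averages per-model predicted probabilities.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC computed as the fraction of (positive, negative) pairs
    in which the positive example receives the higher score.
    Tied pairs count as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_ensemble(model_scores: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model probabilities.
    Weights are normalised so they do not have to sum to 1."""
    if len(model_scores) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(model_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, model_scores)) / total
        for i in range(n_samples)
    ]


# Toy data, invented for illustration only (not results from the study).
# Label 1 = LLM-generated, label 0 = student-written.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]  # stand-in for one classifier's probabilities
model_b = [0.6, 0.7, 0.3, 0.2, 0.4, 0.1]  # stand-in for a second classifier

blended = weighted_ensemble([model_a, model_b], weights=[0.5, 0.5])

print("model A AUC:", roc_auc(labels, model_a))
print("model B AUC:", roc_auc(labels, model_b))
print("ensemble AUC:", roc_auc(labels, blended))
```

On these invented numbers each stand-in model misranks one pair, while the blend ranks every LLM-generated example above every student-written one. That is a property of the toy data and says nothing about the study's actual scores. In practice the probabilities would come from the fitted classifiers, and the weights would be tuned on held-out data.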
Chucheng Zhou studied this question.