The telecommunications industry is characterized by intense competition and rapid technological evolution, making financial stability a critical factor for sustained growth. This work focuses on leveraging machine learning techniques to analyze and predict customer payment behavior within a Portuguese telecommunications company, aiming to reduce financial losses associated with unpaid debts. Using the CRISP-DM methodology, the project first develops supervised learning models to predict whether customers will remain good payers, based solely on internal data. Among the algorithms tested, Random Forest achieved the highest accuracy of 99%, enabling early identification of potential defaulters. Complementing this, unsupervised learning methods, specifically Principal Component Analysis for dimensionality reduction and K-Means clustering, uncover hidden behavioral segments within the customer base. The optimal clustering identified five distinct groups, some of which show near-homogeneous target values (close to 0 or 1), allowing for strong characterization of compliant and non-compliant profiles. The findings demonstrate the effectiveness of combining supervised and unsupervised learning for risk analysis. Supervised models allow scenario testing by altering feature values to simulate changes in payment behavior. In unsupervised learning, analyzing ambiguous clusters through comparison with more definitive ones helps estimate likely client outcomes and supports proactive management. Future work may explore focused clustering of non-compliant clients, alternative data preprocessing, and time series forecasting to further improve predictive accuracy and operational utility.
Arsénio et al. studied this question.
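To make the cluster characterization described above more concrete, the following is a minimal sketch, not the authors' pipeline. It runs K-Means on a small synthetic dataset and reports the mean target value per cluster, so that near-homogeneous clusters (rate close to 0 or 1) can be told apart from ambiguous ones. Everything in it is an assumption made for illustration: the toy data, the choice of three clusters rather than the five reported in the study, the 0.1 and 0.9 thresholds, the farthest-first seeding, and all function names. The PCA step is omitted, and the code uses only the Python standard library; in practice a library such as scikit-learn would likely be used instead.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids


def target_rate_by_cluster(labels, targets, k):
    """Mean of the binary target (1 = good payer) within each cluster."""
    rates = {}
    for c in range(k):
        ys = [y for lab, y in zip(labels, targets) if lab == c]
        rates[c] = sum(ys) / len(ys) if ys else None
    return rates


def make_toy_data(n_per_blob=20, seed=0):
    """Three synthetic, well-separated groups: all compliant, all
    non-compliant, and an evenly mixed group. Not real customer data."""
    rng = random.Random(seed)
    blobs = [
        ((0.0, 0.0), lambda i: 1),
        ((10.0, 10.0), lambda i: 0),
        ((20.0, 0.0), lambda i: i % 2),
    ]
    points, targets = [], []
    for (cx, cy), target_fn in blobs:
        for i in range(n_per_blob):
            points.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
            targets.append(target_fn(i))
    return points, targets


points, targets = make_toy_data()
labels, centroids = kmeans(points, k=3)
rates = target_rate_by_cluster(labels, targets, k=3)
for c, rate in sorted(rates.items()):
    kind = "ambiguous" if 0.1 < rate < 0.9 else "near-homogeneous"
    print(f"cluster {c}: target rate {rate:.2f} ({kind})")
```

In this toy setting, the ambiguous cluster is the one whose members would be compared against the two near-homogeneous clusters to estimate which outcome each client is more likely to follow.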