Understanding consumer behavior from transaction data has become key to improving marketing efficiency. This study explores the application of machine learning (ML) techniques to data-driven consumer segmentation, with the aim of improving product marketing strategies. It addresses a limitation in the existing literature: the handling of high-dimensional data, which can reduce segmentation quality. Previous studies have applied clustering algorithms such as K-means without dimensionality reduction, which often leads to lower accuracy and long computation times. We propose an approach that combines principal component analysis (PCA) for dimensionality reduction with K-means clustering to segment consumers by purchasing behavior. Experimental results show that using PCA to reduce data dimensionality significantly improves segmentation quality, yielding an inertia of 1,455,650 and a silhouette score of 0.486366. With this method, consumers are grouped into three segments based on frequently purchased product categories and the most common payment methods. These findings provide a scalable, data-driven segmentation framework that can improve marketing effectiveness by offering special discounts on various products based on the payment method used.
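The PCA-then-K-means pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes scikit-learn and NumPy are available, and the function name `segment_consumers`, the standardization step, the choice of two principal components, and the synthetic feature matrix are all assumptions introduced here. Only the three-segment setting and the two reported metrics (inertia and silhouette score) come from the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def segment_consumers(features, n_components=2, n_clusters=3, seed=0):
    """Standardize features, project them with PCA, then cluster with K-means.

    Hypothetical helper: the abstract does not state the preprocessing or
    the number of components retained, so both are assumptions here.
    """
    # Put every feature on the same scale so no single column dominates PCA.
    scaled = StandardScaler().fit_transform(features)
    # Reduce dimensionality before clustering.
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)
    # Cluster in the reduced space; three segments as reported in the abstract.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(reduced)
    # Inertia: within-cluster sum of squared distances (lower is tighter).
    # Silhouette: cohesion vs. separation, in [-1, 1] (higher is better).
    return labels, kmeans.inertia_, silhouette_score(reduced, labels)


# Synthetic stand-in for an encoded transaction table (NOT the study's data):
# 300 consumers, 20 numeric features, drawn around three random centers.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 20))
features = np.vstack([c + rng.normal(size=(100, 20)) for c in centers])

labels, inertia, silhouette = segment_consumers(features)
print(f"inertia={inertia:.1f}, silhouette={silhouette:.3f}")
```

Because inertia depends on the scale and size of the data, values from this sketch are not comparable to the figures reported in the abstract; the silhouette score is the more portable of the two metrics when comparing runs with and without PCA.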
Purnamasari et al. studied this question.