• A machine learning model was developed as a mapping tool to support the early screening phase of the private equity decision-making process.
• K-Means and Hierarchical Clustering were compared for the generation of informational profiles, with XGBoost emerging as the most effective supervised algorithm.
• Total assets and net financial position are the most influential drivers of the distinction between firms.
• Integrating explainable AI techniques such as SHAP helps overcome the opacity of black-box models, promoting transparency and supporting the transition toward analytical reasoning, also known as System 2 thinking.

This study develops a machine learning framework to support the early-stage screening of Italian small and medium enterprises (SMEs) by private equity (PE) investors through the construction of economically interpretable information clusters. The dataset consists of 240,705 firm-level observations from 16,047 technology-oriented Italian SMEs, combining balance sheet and income statement variables. The methodology applies two unsupervised clustering techniques, K-Means and Hierarchical Clustering, to identify groups of firms with similar economic and financial profiles. Six supervised machine learning models are subsequently trained to evaluate their ability to assign new firms to the identified clusters, with XGBoost consistently achieving the highest mapping performance even under class imbalance. Results show that machine learning algorithms can effectively categorize firms into information clusters aligned with PE screening logic, with Total Assets and Net Financial Position emerging as key discriminating dimensions. The integration of explainable AI techniques, particularly SHAP, provides transparent insights into the contribution of each financial variable to the model’s decisions, addressing concerns related to model interpretability and enhancing investor trust.
By organizing heterogeneous firm information into standardized profiles, the proposed framework reduces information asymmetry and supports more structured and consistent screening decisions under uncertainty. The contribution of this study is twofold. From a theoretical perspective, it bridges behavioral finance theory, selection determinants and data-driven approaches in private equity, demonstrating how machine learning can complement rather than replace investor judgment. From a practical standpoint, it offers a transparent and interpretable decision-support tool that may help mitigate reliance on heuristics and subjective biases during the initial evaluation of investment opportunities.
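The two-stage idea described above (cluster firms into profiles, then map a new firm to a profile) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the firm data are synthetic and invented for illustration, the function names (`standardize`, `kmeans`, `map_new_firm`) are hypothetical, and a simple nearest-centroid rule stands in for the supervised classifiers (such as XGBoost) that the study actually trains. Only two features are used, loosely named after the two drivers the study highlights.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so large-scale variables do not dominate distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    scaled = [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]
    return scaled, mus, sds


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append([mean(c) for c in zip(*members)] if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Synthetic firms (assumption): [total_assets, net_financial_position], in EUR thousands.
rng = random.Random(42)
small = [[rng.gauss(2_000, 300), rng.gauss(-200, 100)] for _ in range(30)]
large = [[rng.gauss(20_000, 2_000), rng.gauss(5_000, 800)] for _ in range(30)]
firms = small + large

scaled, mus, sds = standardize(firms)
centroids, labels = kmeans(scaled, k=2)


def map_new_firm(firm):
    """Assign an unseen firm to a cluster (nearest-centroid stand-in for a classifier)."""
    z = [(v - m) / s for v, m, s in zip(firm, mus, sds)]
    return nearest(z, centroids)


print("cluster of a small new firm:", map_new_firm([2_100, -150]))
print("cluster of a large new firm:", map_new_firm([19_000, 4_800]))
```

In a fuller pipeline the cluster labels would become the training target for a supervised model, and an attribution method such as SHAP would then be applied to that model; both steps are omitted here to keep the sketch dependency-free.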
Maccagni et al. studied this question.