As software has spread into every sector of society, software quality has become a central concern for industry. In the early stages of a project, too little defect data is available to train an accurate predictor, which has driven research into cross-project software defect prediction. Traditional hand-crafted metric features are hampered by differences in data distribution between the source and target projects, which reduce prediction effectiveness. Furthermore, a single type of feature cannot comprehensively characterize software information. This paper proposes a cross-project software defect prediction method based on domain adaptation and feature fusion (DAFF-CPDP). The model employs the TCA+ algorithm for domain adaptation and uses an encoder layer for progressive feature fusion. Multiple Java projects were selected for evaluation. Comparisons with various baseline models show that the proposed model outperforms both traditional machine learning models built on hand-crafted features and a range of deep learning models that use single or multiple features. The paper also analyzes the impact of different source projects on target projects, confirming that class-balanced datasets and datasets with smaller distribution differences are more conducive to prediction.
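To make the domain adaptation step concrete, the following is a minimal sketch of transfer component analysis (TCA), the projection on which TCA+ builds. It is an illustration under stated assumptions, not the implementation used in DAFF-CPDP: the function name `tca_transform`, the linear kernel, the fixed z-score normalization, and the parameters `n_components` and `mu` are all hypothetical choices. TCA+ additionally selects a normalization scheme from the characteristics of the source and target data, and that selection step is omitted here.

```python
import numpy as np


def tca_transform(Xs, Xt, n_components=2, mu=1.0):
    """Map source (Xs) and target (Xt) samples into a shared latent space.

    Sketch of TCA with a linear kernel. The z-score normalization is an
    assumption; TCA+ would choose the normalization from the data instead.
    """
    ns, nt = len(Xs), len(Xt)
    n = ns + nt

    # Pool both projects and normalize each feature (assumed z-score).
    X = np.vstack([Xs, Xt]).astype(float)
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

    # Linear kernel matrix over all pooled samples.
    K = X @ X.T

    # MMD coefficient matrix: 1/ns^2 within source, 1/nt^2 within target,
    # and -1/(ns*nt) across the two domains.
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])[:, None]
    L = e @ e.T

    # Centering matrix, used to preserve variance in the latent space.
    H = np.eye(n) - np.ones((n, n)) / n

    # Leading eigenvectors of (K L K + mu I)^{-1} K H K give the projection.
    A = K @ L @ K + mu * np.eye(n)
    B = K @ H @ K
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(A, B))
    order = np.argsort(-eigvals.real)[:n_components]
    W = eigvecs[:, order].real

    Z = K @ W
    return Z[:ns], Z[ns:]


# Synthetic stand-ins for metric features of a source and a target project.
rng = np.random.default_rng(0)
source_features = rng.normal(loc=0.0, size=(30, 5))
target_features = rng.normal(loc=1.0, size=(20, 5))
Zs, Zt = tca_transform(source_features, target_features, n_components=2)
```

A classifier would then be trained on `Zs` with the source labels and applied to `Zt`. The trade-off parameter `mu` regularizes the projection, and its value here is arbitrary.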
Guo et al. (Thu,) studied this question.