Archival openness auditing, a process at the intersection of data science and archival management, is critical for public information access, but it must handle long texts in which the core information is sparse. To address this, this paper proposes ALC-MCFN, an Assisted Archival Auditing Model based on deep semantic understanding and decision transparency, which uses intelligent analytics to optimize decision-making on archival openness. For deep semantic understanding, a semantic-aware dynamic truncation mechanism first removes redundant text while preserving key logical structures. The model then fuses global, local, and logical semantic features extracted by BERT, TextCNN, and TextGCN, overcoming the limitations of single-view feature representation. To address the “black box” problem of deep learning in compliance auditing, the SHAP method is introduced to provide post hoc interpretability: by visualizing the contribution of key textual features to the auditing results, the model makes its decisions more transparent and trustworthy. Experimental results show that ALC-MCFN outperforms mainstream baseline models, achieving an F1-score of 77.21% on the self-built archival-domain OParchives dataset (1.15 percentage points higher than the BERT baseline), and thereby provides data science support for risk control and efficiency improvement in intelligent archival management.
Feng et al. studied this question.
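The multi-view fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the BERT, TextCNN, and TextGCN features are combined, so simple concatenation followed by a linear softmax classifier is assumed here purely for illustration. The function names (`fuse_views`, `classify`), the toy vectors, and the weights are all hypothetical and are not taken from ALC-MCFN.

```python
import math
from typing import List, Sequence


def fuse_views(global_vec: Sequence[float],
               local_vec: Sequence[float],
               logical_vec: Sequence[float]) -> List[float]:
    """Concatenate the three per-document feature vectors.

    Assumption: concatenation stands in for the model's fusion step,
    which the abstract does not specify.
    """
    return [*global_vec, *local_vec, *logical_vec]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused: Sequence[float],
             weights: Sequence[Sequence[float]],
             bias: Sequence[float]) -> List[float]:
    """Apply one linear layer (one weight row per class) and softmax."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Toy stand-ins for encoder outputs (hypothetical values, not real features).
bert_vec = [0.2, -0.1]   # "global" view
cnn_vec = [0.5]          # "local" view
gcn_vec = [0.3, 0.0]     # "logical" view

fused = fuse_views(bert_vec, cnn_vec, gcn_vec)

# Hypothetical two-class head (e.g. "open" vs. "restricted").
weights = [[0.1] * len(fused), [-0.1] * len(fused)]
bias = [0.0, 0.0]
probs = classify(fused, weights, bias)
```

In this sketch each view contributes its own slice of the fused vector, so a downstream attribution method could in principle report contributions per view as well as per feature. A trained system would learn the weights rather than fix them by hand.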