Ancient Chinese texts are markedly intertextual: scholars cite, reinterpret, and reconstruct earlier works, forming an intellectual lineage spanning millennia. With advances in digital humanities, automated detection of text reuse in vast classical corpora has become feasible. However, existing algorithms remain largely confined to surface-level character matching and struggle to identify deep semantic correlations. To address this problem, we propose a novel text reuse detection method for ancient Chinese literature based on knowledge distillation, which significantly enhances semantic understanding of classical texts while maintaining computational efficiency. Additionally, we construct a high-quality annotated dataset to establish a reliable benchmark for algorithmic evaluation. Through concrete case studies, we demonstrate the method’s applicability in cultural analysis, offering a novel technical pathway for the digitization and intelligent analysis of cultural heritage.
Fu et al. (Sat,) studied this question.