The digitalization of traditional Chinese culture calls for accurate identification of poem themes, a key task in natural language processing (NLP). Progress, however, is hindered by the scarcity of dedicated ancient Chinese corpora and the poor adaptability of general-purpose models. To address these issues, this paper proposes BA-RILA (BACM-Attention-Rhythm-Imagery-BiLSTM-Aided Framework), a novel method for ancient poetry theme recognition. The method is built on a three-dimensional text feature fusion framework covering semantics, rhythm, and imagery. Semantic vectors are extracted with BACM, an optimized BERT pre-trained model for ancient Chinese, while 11-dimensional rhythmic features and 75-dimensional imagery features are extracted simultaneously. To fuse these heterogeneous features in a complementary way, their dimensions are first aligned and an attention-weighting scheme is then applied. The fused features are passed to a two-layer BiLSTM to capture deep temporal-semantic dependencies within the text. An 8-head multi-head attention (MHA) mechanism then refines the features by dynamically strengthening the weights of the most salient elements. Finally, a four-layer fully connected classifier maps the enhanced features to a six-category theme space, and Softmax outputs the category probability distribution. Experiments show that the method significantly outperforms benchmark models and generalizes well across Tang and Song dynasty datasets.
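The dimension-alignment and attention-weighting step described above can be illustrated with a minimal sketch. The abstract does not specify how alignment or weighting is implemented, so everything here beyond the 11- and 75-dimensional feature sizes is an assumption made for illustration: the linear projections, the single query vector used to score each feature type, the 768-dimensional semantic vector, the shared width of 16, and all function names. The weights are random rather than trained, so the output shows only the shape of the computation, not the behavior of BA-RILA.

```python
import math
import random

# 11 and 75 come from the text; 768 (a typical BERT hidden size) and the
# shared width of 16 are assumptions for this sketch.
SEM_DIM, RHY_DIM, IMG_DIM, SHARED_DIM = 768, 11, 75, 16


def make_linear(in_dim, out_dim, rng):
    """Return a random (out_dim x in_dim) weight matrix as a list of rows."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(weights, vec):
    """Matrix-vector product that maps vec into the shared space."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(semantic, rhythm, imagery, projections, query):
    """Align three feature vectors to one width, then attention-weight them."""
    aligned = [project(w, v)
               for w, v in zip(projections, (semantic, rhythm, imagery))]
    # One scalar score per feature type: scaled dot product with the query.
    scores = [sum(q * a for q, a in zip(query, vec)) / math.sqrt(len(query))
              for vec in aligned]
    weights = softmax(scores)
    # Weighted sum of the aligned vectors, element by element.
    fused = [sum(w * vec[i] for w, vec in zip(weights, aligned))
             for i in range(len(query))]
    return fused, weights


rng = random.Random(0)
projections = [make_linear(d, SHARED_DIM, rng)
               for d in (SEM_DIM, RHY_DIM, IMG_DIM)]
query = [rng.uniform(-1, 1) for _ in range(SHARED_DIM)]

# Placeholder inputs standing in for real extracted features.
semantic = [rng.gauss(0, 1) for _ in range(SEM_DIM)]
rhythm = [rng.gauss(0, 1) for _ in range(RHY_DIM)]
imagery = [float(rng.random() < 0.1) for _ in range(IMG_DIM)]

fused, weights = fuse(semantic, rhythm, imagery, projections, query)
print(len(fused), [round(w, 3) for w in weights])
```

In a full pipeline of the kind the abstract describes, a fused representation like this would feed the BiLSTM, multi-head attention, and classifier stages; those are omitted here to keep the sketch dependency-free.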
Zhang et al. (Thu,) studied this question.