March 15, 2024

Chinese Named Entity Recognition Based on BERT and Grouped-query Attention

Abstract

Named entity recognition (NER) is an important task in natural language processing (NLP), and Chinese NER is more difficult than English NER. Machine learning and deep learning methods are now widely used in NER research. Traditional NER methods often ignore long-distance syntactic dependencies between words, and the word representations such models rely on usually ignore the contextual semantics of the target word and so fail to capture its polysemy. To address these shortcomings, this paper proposes a BERT-BiLSTM-GQA-CRF model for Chinese NER. The BERT pre-trained language model generates word vectors that represent contextual semantic information; the resulting sequence of word vectors is fed into a BiLSTM layer with embedded grouped-query attention (GQA) to obtain global semantic information; finally, a CRF layer decodes the sequence of entity labels. Compared with multi-head attention (MHA), multi-query attention (MQA) uses only a single key-value head, which can greatly speed up decoder inference. However, MQA degrades quality, so this paper uses GQA to make the model more accurate than with MHA while keeping speed comparable to MQA. Experimental results on the Weibo and MSRA corpora show that the proposed model improves both recognition performance and speed.
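
The relationship between MHA, MQA, and GQA mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attend`, `grouped_query_attention`, `random_heads`), the head counts, and the dimensions are illustrative assumptions, and the learned projections as well as the BERT, BiLSTM, and CRF layers are omitted. The only idea shown is that each key-value head is shared by a group of query heads, so that a group count equal to the number of query heads behaves like MHA and a group count of one behaves like MQA.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q_head, k_head, v_head):
    """Scaled dot-product attention for one head (inputs are seq_len x d lists)."""
    d = len(q_head[0])
    out = []
    for q_vec in q_head:
        scores = [
            sum(a * b for a, b in zip(q_vec, k_vec)) / math.sqrt(d)
            for k_vec in k_head
        ]
        weights = softmax(scores)
        out.append([
            sum(w * v_vec[j] for w, v_vec in zip(weights, v_head))
            for j in range(len(v_head[0]))
        ])
    return out


def grouped_query_attention(q, k, v):
    """q holds H query heads; k and v hold G key-value heads, where G divides H.

    G == H corresponds to multi-head attention, G == 1 to multi-query attention.
    """
    num_q_heads, num_kv_heads = len(q), len(k)
    if len(v) != num_kv_heads or num_q_heads % num_kv_heads != 0:
        raise ValueError("number of query heads must be a multiple of K/V heads")
    group_size = num_q_heads // num_kv_heads
    # Query head h reads the key-value head assigned to its group.
    return [
        attend(q[h], k[h // group_size], v[h // group_size])
        for h in range(num_q_heads)
    ]


def random_heads(rng, num_heads, seq_len, d):
    """Hypothetical helper: random activations standing in for projected inputs."""
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(seq_len)]
        for _ in range(num_heads)
    ]


rng = random.Random(0)
q = random_heads(rng, 8, 5, 4)
for num_kv_heads in (8, 2, 1):  # MHA-like, GQA, MQA-like
    k = random_heads(rng, num_kv_heads, 5, 4)
    v = random_heads(rng, num_kv_heads, 5, 4)
    out = grouped_query_attention(q, k, v)
    print(num_kv_heads, "K/V heads ->", len(out), "output heads")
```

In all three settings the output has one result per query head; only the number of key and value heads that must be stored and read changes, which is where the inference speedup of MQA and GQA over MHA comes from.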
