Scientific and accurate agronomic knowledge is key to ensuring efficient wheat production. China’s vast agricultural land spans a wide range of longitudes and latitudes, and agronomic practices are closely tied to temporal factors such as wheat growth stages, so agronomic knowledge exhibits significant spatiotemporal variability. Constructing a spatiotemporal knowledge graph of wheat production can offer multi-dimensional data support and enable deeper knowledge services. Wheat agronomic knowledge is often fragmented and unstructured, and two key challenges are efficiently extracting the text segments that contain agronomic knowledge and extracting agronomic knowledge triples. Because attribute values make up a high proportion of agronomic knowledge and carry significant value for production services, an attribute-rich agronomic knowledge graph schema was created. Based on the characteristics of agronomic texts, a keyword attention mechanism (KAM) was proposed and integrated with an improved BERT model for sentence-level feature extraction, yielding AgronomicCorpusExtraction, an extraction model for agronomic knowledge text corpora. Agronomic knowledge of wheat production is characterized by non-standard syntax, complex multi-layer structures, diverse entity expressions, and a wide span of scope, and existing extraction methods cannot achieve satisfactory results. To address this issue, a joint extraction model, AgronomicTripleExtraction, was proposed to extract entities, attributes, and relations in different phases. First, BERT and BiGRU were used jointly to extract long- and short-distance features, and a CRF with globally normalized joint modeling was used to extract attributes. Then, the intermediate features between attributes of the same type were extracted by average pooling to segment different entities.
Finally, a relation-aware relation feature enhancement (RAFE) method was created, and an MLP was used to extract relations based on the relation matrix constructed from the knowledge graph schema. Ablation experiments were conducted to evaluate AgronomicCorpusExtraction with and without KAM, and AgronomicTripleExtraction under four conditions: the full model with BiGRU, RAFE, and entity segmentation; without BiGRU; without RAFE; and without entity segmentation. The results indicate that KAM improves the F1-score by 0.128, and that AgronomicTripleExtraction achieves F1-scores of 0.897, 0.875, and 0.871 for attribute, entity, and relation extraction, respectively, when the three modules are used together; removing any single module degrades performance to some degree. Comparative experiments were conducted between AgronomicTripleExtraction and several recently published state-of-the-art models.
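To make the idea of a relation matrix constructed from the knowledge graph schema more concrete, the following minimal Python sketch shows one way such a matrix could be derived and used to restrict which entity pairs are passed to a relation classifier. It is an illustration under stated assumptions, not the method used in this work: the schema entries, the entity types, and the function names build_relation_matrix and candidate_pairs are all hypothetical, and the MLP and RAFE components are not modeled.

```python
from itertools import permutations

# Hypothetical schema of (head type, relation, tail type) entries.
# These names are illustrative assumptions, not the paper's actual schema.
SCHEMA = [
    ("Variety", "suitable_for", "Region"),
    ("GrowthStage", "requires", "Operation"),
    ("Operation", "uses", "Material"),
]


def build_relation_matrix(schema):
    """Return (types, relations, matrix), where matrix[i][j] holds the
    indices of the relations the schema allows from type i to type j."""
    types = sorted({t for head, _, tail in schema for t in (head, tail)})
    relations = sorted({rel for _, rel, _ in schema})
    type_idx = {t: i for i, t in enumerate(types)}
    rel_idx = {r: i for i, r in enumerate(relations)}
    matrix = [[set() for _ in types] for _ in types]
    for head, rel, tail in schema:
        matrix[type_idx[head]][type_idx[tail]].add(rel_idx[rel])
    return types, relations, matrix


def candidate_pairs(entities, schema):
    """List (head text, tail text, allowed relation names) for every ordered
    entity pair whose type combination is permitted by the schema."""
    types, relations, matrix = build_relation_matrix(schema)
    type_idx = {t: i for i, t in enumerate(types)}
    pairs = []
    for (h_text, h_type), (t_text, t_type) in permutations(entities, 2):
        if h_type not in type_idx or t_type not in type_idx:
            continue  # entity type unknown to the schema: skip the pair
        allowed = matrix[type_idx[h_type]][type_idx[t_type]]
        if allowed:
            names = sorted(relations[k] for k in allowed)
            pairs.append((h_text, t_text, names))
    return pairs


# Example input: (text, type) pairs as an upstream entity extractor might emit.
entities = [
    ("jointing stage", "GrowthStage"),
    ("topdressing", "Operation"),
    ("urea", "Material"),
]
for head, tail, allowed in candidate_pairs(entities, SCHEMA):
    print(head, "->", tail, allowed)
```

In a design along these lines, only the pairs that survive the schema filter would need to be scored by a relation classifier, which narrows the label space for each pair to the relations the schema permits.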
Guo et al. (Mon,) studied this question.