To classify patent texts in the security field efficiently, a patent text classification model based on Word2Vec and Long Short-Term Memory (LSTM) was established. The model takes the characteristics of patent text into account. First, during text pre-processing, words that appear frequently in patent documents, such as “the invention”, “involvement”, and “utility model”, were added to the stop word list to save storage space and improve efficiency. Second, a pre-trained Word2Vec model was introduced to avoid the curse of dimensionality caused by traditional methods. Finally, an LSTM classification model was trained to extract text features and classify patent texts in the security field. A total of 50,000 patent documents were divided into a training set and a test set at a ratio of 4:1, and accuracy and the ROC curve were used to evaluate the classification results. The results showed that the classification accuracy of this method is 93.48%. The method was also compared with an LSTM classification model, a K-Nearest Neighbor (KNN) classification model, a Convolutional Neural Network (CNN) classification model, and models based on CNN and Word2Vec. The experimental results showed that this method can better classify patent texts in the security field, laying the foundation for further research and effective use of patents.
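The pre-processing and evaluation steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`preprocess`, `split_4_to_1`, `accuracy`), the generic stop-word set, and the whitespace tokenizer are all hypothetical, and the Word2Vec embedding and LSTM training stages are left out because they depend on third-party libraries the abstract does not name.

```python
import random

# Hypothetical stop-word lists: a few generic English words plus the
# patent-specific terms named in the abstract.
GENERIC_STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}
PATENT_STOP_PHRASES = ["the invention", "utility model", "involvement"]


def preprocess(text):
    """Lowercase, strip patent-specific phrases, then drop generic stop words."""
    lowered = text.lower()
    for phrase in PATENT_STOP_PHRASES:
        # Plain substring replacement; a real pipeline would use a proper
        # tokenizer or word segmenter for the source language.
        lowered = lowered.replace(phrase, " ")
    return [tok for tok in lowered.split() if tok not in GENERIC_STOP_WORDS]


def split_4_to_1(samples, seed=0):
    """Shuffle and split samples into training and test sets at a 4:1 ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

With 50,000 documents, `split_4_to_1` yields 40,000 training and 10,000 test samples, which matches the 4:1 ratio reported above. The token lists returned by `preprocess` would then be mapped to pre-trained word vectors before being passed to the classifier.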
Lizhong Xiao
Shanghai Institute of Technology
Guang‐Zhong Wang
Shanghai Institute of Nutrition and Health
Yang Zuo
Lanzhou University of Technology
Shanghai Institute of Technology
Xiao et al. studied this question.
synapsesocial.com/papers/6a157b2ba2352da347827fc2 — DOI: https://doi.org/10.1109/iscid.2018.00023