December 1, 2018

Research on Patent Text Classification Based on Word2Vec and LSTM

Abstract

To classify patent texts in the security field efficiently, a patent text classification model based on Word2Vec and Long Short-Term Memory (LSTM) was established. The method was designed around the features of patent text. First, during text pre-processing, words that appear frequently in patent documents, such as “the invention”, “involvement”, and “utility model”, were added to the stop word list to save storage space and improve efficiency. Second, a pre-trained Word2Vec model was introduced to avoid the curse of dimensionality caused by traditional methods. Finally, an LSTM classification model was trained to extract text features and classify patent texts in the security field. A total of 50,000 patent documents were divided into a training set and a test set at a ratio of 4:1, and the classification results were evaluated using accuracy and the ROC curve. The method achieved a classification accuracy of 93.48%. The LSTM classification model, a K-Nearest Neighbor (KNN) classification model, a Convolutional Neural Network (CNN) classification model, and models based on CNN and Word2Vec were also compared. The experimental results showed that this method classifies patent texts in the security field better than the compared models, laying the foundation for further research and effective use of patents.
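The pre-processing and evaluation steps described in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and leaves out the Word2Vec and LSTM stages. The function names (`remove_stop_phrases`, `split_train_test`, `accuracy`), the three-entry stop list, the whitespace tokenisation of English text, and the fixed random seed are all assumptions made for illustration; none of them come from the paper.

```python
import random
import re

# Stop phrases quoted in the abstract. The full list used in the paper
# is not given, so this three-entry set is only an illustration.
PATENT_STOP_PHRASES = {"the invention", "involvement", "utility model"}


def remove_stop_phrases(text, stop_phrases=PATENT_STOP_PHRASES):
    """Lower-case the text, drop whole-word stop phrases, return tokens."""
    cleaned = text.lower()
    # Remove longer phrases first so a short entry cannot break up a long one.
    for phrase in sorted(stop_phrases, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        cleaned = re.sub(pattern, " ", cleaned)
    # Whitespace tokenisation is an assumption; the paper's tokeniser
    # is not described in the abstract.
    return cleaned.split()


def split_train_test(samples, train_parts=4, test_parts=1, seed=0):
    """Shuffle the samples and split them at a train:test ratio (4:1 here)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


if __name__ == "__main__":
    print(remove_stop_phrases("The invention relates to a utility model for door locks"))
    train, test = split_train_test(range(50000))
    print(len(train), len(test))
```

With 50,000 documents, a 4:1 split yields 40,000 training and 10,000 test documents. In a full pipeline the token lists would be mapped to Word2Vec vectors and fed to the LSTM classifier, and `accuracy` would be applied to its predictions on the test set.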

Authors

Lizhong Xiao

Shanghai Institute of Technology

Guang‐Zhong Wang

Shanghai Institute of Nutrition and Health

Yang Zuo

Lanzhou University of Technology

Institutions

Shanghai Institute of Technology
