September 4, 2019

SMILES-BERT

Abstract

With the rapid progress of AI in both academia and industry, deep learning has been widely adopted across drug discovery to accelerate its pace and cut R&D costs. Among the problems in drug discovery, molecular property prediction is one of the most important. Unlike general deep learning applications, molecular property prediction has only a limited amount of labeled data. To address this, deep learning methods have begun to exploit large amounts of unlabeled data to improve prediction performance on small labeled datasets. In this paper, we propose a semi-supervised model named SMILES-BERT, which consists of attention-based Transformer layers. The model is pre-trained on large-scale unlabeled data through a Masked SMILES Recovery task. The pre-trained model can then be easily adapted to different molecular property prediction tasks via fine-tuning. In the experiments, the proposed SMILES-BERT outperforms the state-of-the-art methods on all three datasets, showing the effectiveness of our unsupervised pre-training and the strong generalization capability of the pre-trained model.
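To make the Masked SMILES Recovery idea concrete, the following is a minimal sketch of how training inputs for such a task might be prepared. It is not the authors' implementation: the regex tokenizer, the `<mask>` symbol, the function names, and the 15% masking rate (borrowed from the original BERT convention) are all assumptions, since the abstract does not specify them.

```python
import random
import re

# Hypothetical tokenizer: bracket atoms, two-letter halogens, then single characters.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")
MASK = "<mask>"  # assumed mask symbol, not taken from the paper


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def mask_tokens(tokens, mask_rate=0.15, seed=None):
    """Replace a random subset of tokens with MASK.

    Returns the masked token list and a dict mapping each masked
    position to the original token the model would have to recover.
    The 15% default rate is an assumption.
    """
    rng = random.Random(seed)
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, targets


tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
masked, targets = mask_tokens(tokens, seed=0)
print(masked)
print(targets)
```

During pre-training, a model would receive `masked` as input and be trained to predict the entries of `targets`; fine-tuning would then replace that recovery objective with a property-prediction head.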

Authors

Sheng Wang

Ningbo University

Yuzhi Guo

The University of Texas at Arlington

Yuhong Wang

National Center for Advancing Translational Sciences

Institutions

The University of Texas at Arlington
