With the rapid progress of AI in both academia and industry, Deep Learning has been widely adopted across drug discovery to accelerate its pace and cut R&D costs. Among the problems in drug discovery, molecular property prediction is one of the most important. Unlike general Deep Learning applications, molecular property prediction has only a limited amount of labeled data. To address this, Deep Learning methods have begun to focus on using large amounts of unlabeled data to improve prediction performance on small labeled datasets. In this paper, we propose a semi-supervised model named SMILES-BERT, which is built from attention-based Transformer layers. The model is pre-trained on a large-scale unlabeled dataset through a Masked SMILES Recovery task. The pre-trained model can then be readily adapted to different molecular property prediction tasks via fine-tuning. In the experiments, the proposed SMILES-BERT outperforms the state-of-the-art methods on all three datasets, demonstrating the effectiveness of our unsupervised pre-training and the strong generalization capability of the pre-trained model.
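To make the Masked SMILES Recovery idea concrete, here is a minimal sketch of how a masked input and its recovery targets might be prepared from one SMILES string. It is an illustration under stated assumptions, not the authors' implementation: the regex tokenizer, the `<MASK>` symbol, the 15% mask rate, and the helper names `tokenize` and `mask_tokens` are all hypothetical choices that the abstract does not specify.

```python
import random
import re

# Assumed tokenizer (not from the paper): bracket atoms, the two-letter
# halogens Br and Cl, two-digit ring closures, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

# Assumed placeholder symbol for a hidden token.
MASK = "<MASK>"


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def mask_tokens(tokens, mask_rate=0.15, rng=None):
    """Hide a random subset of tokens.

    Returns the masked token list and a dict mapping each masked
    position to the original token a model would be trained to recover.
    The 15% default rate is an assumption borrowed from BERT-style
    pre-training, not a value stated in the abstract.
    """
    if not tokens:
        return [], {}
    rng = rng or random.Random()
    # Mask at least one token, and never more than the sequence holds.
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, targets


if __name__ == "__main__":
    # Aspirin, used here only as an example molecule.
    tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")
    masked, targets = mask_tokens(tokens, rng=random.Random(0))
    print("".join(masked))
    print(targets)
```

In a pre-training loop, the masked sequence would be the model input and the entries of `targets` would supply the labels for the recovery loss. The same pre-trained encoder would then be fine-tuned on each labeled property prediction task.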
Sheng Wang
Ningbo University
Yuzhi Guo
The University of Texas at Arlington
Yuhong Wang
National Center for Advancing Translational Sciences
The University of Texas at Arlington
Wang et al. (Wed,) studied this question.
synapsesocial.com/papers/69da229a00ab073a27837ce1 — DOI: https://doi.org/10.1145/3307339.3342186