June 1, 2020 · Open Access

Deep learning‐based classification and mutation prediction from histopathological images of hepatocellular carcinoma

Abstract

Although the diagnosis of hepatocellular carcinoma (HCC) relies mainly on noninvasive approaches such as computed tomography or magnetic resonance imaging,1, 2 histopathological evaluation remains indispensable in the clinical care of patients, because pathology not only allows a definitive diagnosis but also provides significant prognostic information.3 Moreover, histological subtypes of HCC have been shown to be related to somatic mutation burden,3, 4 which suggests a link between the molecular features of HCC and its histological phenotypes. Recently, an association between activating mutations and the response to multiple tyrosine kinase inhibitors or immunotherapy has been established in HCC patients.5-8 Taken together, these findings support personalized management of each HCC patient based on histopathology. However, visual inspection of tissue slides is typically performed exhaustively at magnifications from 5× to 40×, which makes it time-consuming for a pathologist to interpret the complexity of histopathological morphology.9 In this study, we constructed a convolutional neural network (CNN)-based platform using whole-slide images (WSIs) of hematoxylin and eosin (H Table S2). However, in the external validation set, 12.3% and 21.6% of tiles were misclassified at 5× magnification (Figure S3C, left panel) and 20× magnification (Figure S3C, right panel), respectively. Even so, per-tile classification at 5× magnification still achieved an AUC of 0.949 (Table S2), whereas classification at 20× magnification performed significantly worse (AUC = 0.860; Figure S3D). To assess classification accuracy at the per-slide level, the per-tile results were aggregated into a per-slide classification using the two methods previously described.
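As a rough illustration of how per-tile probabilities might be aggregated into a per-slide score, the sketch below shows two common strategies: averaging the tile probabilities, and taking the fraction of tiles classified as positive. Whether these correspond exactly to Methods 1 and 2 is an assumption; the function names, the 0.5 threshold, and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean


def aggregate_mean(tile_probs):
    """Per-slide score as the average per-tile HCC probability (assumed strategy)."""
    if not tile_probs:
        raise ValueError("a slide needs at least one tile")
    return mean(tile_probs)


def aggregate_fraction(tile_probs, threshold=0.5):
    """Per-slide score as the fraction of tiles whose probability exceeds
    the threshold (assumed strategy; the 0.5 threshold is hypothetical)."""
    if not tile_probs:
        raise ValueError("a slide needs at least one tile")
    positives = sum(1 for p in tile_probs if p > threshold)
    return positives / len(tile_probs)


# Hypothetical per-tile HCC probabilities for a single slide.
tile_probs = [0.9, 0.8, 0.4, 0.7]
slide_score_mean = aggregate_mean(tile_probs)
slide_score_fraction = aggregate_fraction(tile_probs)
```

The two scores can disagree when a few tiles carry extreme probabilities, which is one reason to report both.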
Both aggregation methods (Methods 1 and 2) yielded almost error-free classification in the test set (Figures 2A and S4A; Table S2). Consistent with the per-tile results, the AUCs achieved in the external validation set by both methods at 5× magnification were significantly higher than those at 20× magnification (Figures 2B and S4B; Table S2). Nevertheless, dots labeled “HCC” still showed significantly higher probabilities of an HCC diagnosis at 20× magnification when using the CNN classifier (Figures 2B and S4B, left panel; Table S2). Next, we analyzed the correlation between the results obtained at the two magnifications to assess the agreement of per-slide (or per-dot) classification across resolutions (5× vs 20×). In both validation sets (test and external validation), the classification results aggregated at the two magnifications were highly correlated (Figures 2C, 2D, S4C, and S4D). Although the results from the two resolutions were highly consistent when using a binary classifier in the test set (Figures 2E and 2G), ∼50% of the TMA dots labeled “Normal” at 5× magnification received the opposite classification when 20× tiles were used (Figures 2F and 2H). Notably, no significant correlation was found between classification accuracy and WSI (or TMA) size (Figure S5; Spearman's correlation coefficient 0.7 in both the two sets (Figure 3H; Tables S5 and S6). On the other hand, two further mutations that were not predictable in the test set (TTN and PCLO) could be predicted in the external validation set (Table S6). However, mutations of CTNNB1 and TP53, which were predicted with high AUCs in the test set, could not be predicted at this stage (Table S6). These findings suggested that important differences between the WSIs and our TMA dots affected the evaluation of the TCGA-based model.
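The AUC values reported throughout can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The minimal sketch below computes AUC directly from that definition; the function name and the example labels and scores are hypothetical, and this is not the evaluation code used in the study.

```python
def auc_pairwise(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    sample scores higher; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = mutated, 0 = wild-type) and predicted scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = auc_pairwise(labels, scores)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; rank-based formulations give the same value more efficiently on large sets.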
Even so, box plots showed a significant difference in the probability of HCC diagnosis between TP53-mutated and wild-type samples (Figures 3E, S6C, and S7C). The distribution of probabilities on mutated and wild-type tiles for the four mutations that were predictable in the external validation set also showed a higher percentage of positively classified tiles for each mutation in mutated samples than in wild-type ones (Figure S8B). In conclusion, we have provided a promising perspective on HCC diagnosis using a CNN, which unambiguously distinguished tumor from adjacent normal tissue in WSIs (highest AUC, 1.000) and even outperformed the AUC of ∼0.99 achieved in our previous work using image features combined with a random forest classifier.19 On TMAs, the CNN model gained ∼0.2 in AUC over the feature-based approach at 20× magnification.19 Moreover, compared with the Inception V3 model, which has shown excellent performance on WSIs,9 our models required less memory and time and achieved higher prediction accuracy in both tasks 1 and 2 (Table S7; Figures S9 and S10). However, we noticed a significant difference in the task 1 results for TMA dots between resolutions (5× vs 20×). This finding might be attributed to the fact that, compared with feature extraction at 5× magnification, more tiles at 20× magnification are affected by “misleading” features introduced during TMA preparation, such as air bubbles, dull staining, and uneven staining, leading to a more ambiguous per-tile diagnosis of HCC and, in turn, a more ambiguous per-dot diagnosis.
The discrepancy between the TCGA and WCH datasets in mutation prediction by the CNN might arise because only the most representative view of each sample was used during TMA construction, after pathologists had browsed each region of the WSIs, which might lead to the loss of significant information on the histopathological characteristics of tumor samples. Despite these limitations, we believe that our work will inspire further studies extending our classification model to specific histological subtypes of HCC and predicting their genetic alterations. In the future, studies based on large numbers of HCC samples are also needed to retrain our CNN-based models and validate our findings.
We would like to thank the TCGA working group for providing the slide images and the corresponding cancer information. We are most grateful to the Core Facility of West China Hospital for technical support with the experiments.
This work was supported by grants from the National Key Technologies R&D Program (2018YFC1106800), the Natural Science Foundation of China (81972747, 81972204, 81872004, 81800564, 81770615, 81702327, 81700555, 81672882, 61702421, U1811262 and 61772426), the Science and Technology Support Program of Sichuan Province (2019YFQ0001, 2018SZ0115, 2017SZ0003), the Science and Technology Program of Tibet Autonomous Region (XZ201801-GB-02), the 1.3.5 project for disciplines of excellence, West China Hospital, Sichuan University (ZYJC18008), the Natural Science Foundation of Guangdong Province (2019A1515011097), the Innovation Program of Shenzhen (JCYJ20180508165208399), the Science and Technology Planning Project of Guangzhou (201904010089), and the international Postdoctoral Fellowship Program (20180029).
YZ, JP, and KY conceptualized and designed the study. JP, YL, and HL developed the methodology. HL, YL, and RH acquired the data. HL, YL, WW, XS, ZW, ML, and LX analyzed and interpreted the data.
HL, JP, KY, XL, and YZ wrote, reviewed, and/or revised the paper. YZ, KY, and JP provided administrative, technical, or material support. LX, ZW, and ZZ performed the pathological experiments. YZ, JP, and KY supervised the study.
The authors declare no conflict of interest.
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Authors

Haotian Liao

Yuxi Long

Ruijiang Han

Journals

Clinical and Translational Medicine

SHILAP Revista de lepidopterología

Institutions

Sichuan University

Northwestern Polytechnical University

Shenzhen University
