March 3, 2026 · Open Access

A dataset and benchmark of carbonate thin-section images for deep learning

Key Points

Results highlight the dataset's value as a robust benchmark for carbonate petrography research and applications.
The dataset includes 22 lithological categories organized by optical mode, ensuring standardized evaluation.
Samples were collected from multiple geological formations in China and the UAE to enhance dataset diversity.
Evaluation covered established deep learning models, including ResNet, VGG, DenseNet, MobileNet, and EfficientNet.

Abstract

Deep learning has become a key tool for carbonate thin-section image analysis. However, the lack of large public datasets limits reproducibility and fair model comparison. To address this, we present DeepCarbonate, a cleaned and standardized benchmark dataset. Samples were collected from the Ediacaran Dengying, Cambrian Longwangmiao, and Triassic Leikoupo and Jialingjiang Formations in the Sichuan Basin, China, and the Cretaceous Mishrif Formation in the UAE. The dataset was curated by petroleum geology experts; invalid images (blurred, low brightness, or corrupted) were removed through expert voting and 2σ filtering, and all images were reorganized in the ImageNet format. DeepCarbonate contains 22 lithological categories, hierarchically organized by optical mode (PPL, XPL, R) and split into train, validation, and test subsets, ensuring standardized benchmarking and reproducible experiments. Using PyTorch with CUDA acceleration, we evaluated ResNet, VGG, DenseNet, MobileNet, and EfficientNet models under baseline, ablation, long-tailed distribution, and balanced Top-9 subset experiments. Results highlight the dataset's value as a robust benchmark for carbonate petrography research and applications.
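The 2σ filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which image statistic was filtered or how the filter was combined with expert voting, so the example below assumes a single per-image score (here, a hypothetical mean brightness) and keeps only images whose score lies within two standard deviations of the mean. The function name `two_sigma_filter`, the file names, and the score values are all illustrative assumptions, not part of DeepCarbonate.

```python
from statistics import mean, pstdev


def two_sigma_filter(scores, k=2.0):
    """Return the names whose score lies within k standard deviations of the mean.

    `scores` maps an image name to one numeric quality score. Which score the
    dataset authors actually used is not stated; this is an assumption.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the scores
    lower, upper = mu - k * sigma, mu + k * sigma
    return [name for name, score in scores.items() if lower <= score <= upper]


# Hypothetical mean-brightness scores (0-255 scale) for eight images.
brightness = {
    "img_001.jpg": 118.0,
    "img_002.jpg": 122.0,
    "img_003.jpg": 120.0,
    "img_004.jpg": 119.0,
    "img_005.jpg": 121.0,
    "img_006.jpg": 117.0,
    "img_007.jpg": 123.0,
    "img_008.jpg": 12.0,  # much darker than the rest
}

kept = two_sigma_filter(brightness)
print(kept)
```

In this toy input only the very dark image falls outside the mean ± 2σ band and is dropped. A real pipeline would more likely compute the score from pixel data and might apply the filter per category or per optical mode; those choices are left open here.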

Cite This Study

Li et al. studied this question.

synapsesocial.com/papers/69a75c59c6e9836116a252ba https://doi.org/10.1038/s41597-026-06633-5
