What question did this study set out to answer?

The research aims to improve the identification of andesite tectonic environments using advanced machine learning techniques.

February 9, 2026 · Open Access

A dual-track machine learning framework for andesite tectonic environment identification: ensemble learning and few-shot learning

Key Points

Developed a dual-track framework integrating ensemble learning and few-shot learning.
Utilized 26,463 samples from the GEOROC database.
Employed ensemble models like Random Forest, XGBoost, and LightGBM for large-sample analysis.
Used meta-learning (TabPFN pre-training) and knowledge distillation (transfer to CatBoost) to optimize performance on rare tectonic types.
Achieved high precision with an AUC ≥ 0.99 for large samples.
LightGBM was the strongest ensemble model, with a 97% recall rate for the small-sample rare type RV.
The meta-learning and distillation framework boosted recall for RV and OI to 99%, with inference at 0.01 seconds per sample.
Identified key discriminant geochemical elements supporting classical magmatic theories.
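
The recall and AUC figures quoted in these key points can be made concrete with a small worked example. The sketch below is not the authors' evaluation code: it computes per-class recall and a one-vs-rest AUC in plain Python on made-up toy labels and scores. The class labels reuse the abbreviations RV, OI, and CM from the abstract purely as placeholder names, and the function names are hypothetical.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted samples / all samples of that class."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


def one_vs_rest_auc(y_true, scores, positive):
    """AUC of one class against all others, by direct pairwise comparison.

    This equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration, not taken from the study).
y_true = ["CM", "CM", "CM", "CM", "OI", "OI", "RV", "RV"]
y_pred = ["CM", "CM", "CM", "OI", "OI", "OI", "RV", "CM"]
# Hypothetical model scores for the class "RV", one per sample.
rv_scores = [0.1, 0.2, 0.05, 0.3, 0.2, 0.1, 0.9, 0.25]

recall = per_class_recall(y_true, y_pred)        # CM: 0.75, OI: 1.0, RV: 0.5
rv_auc = one_vs_rest_auc(y_true, rv_scores, "RV")  # 11 of 12 pairs ranked correctly
```

In this toy case the rare class has a high AUC (11/12) but a recall of only 0.5, which illustrates why the study reports recall on rare types separately from the overall AUC.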

Abstract

Accurate discrimination of andesite tectonic settings is critical for unraveling Earth’s geodynamic processes. However, existing studies face three key challenges: (1) simplified traditional methods, which rely on single-element ratios and fail to capture the complex petrogenetic processes of andesites; (2) poor performance on small samples, as rare tectonic types (RV) are often misclassified owing to data scarcity; and (3) limited geological interpretability, with most models lacking clear links between geochemical features and magmatic mechanisms. To address these issues, we propose a “dual-track” framework integrating machine learning and few-shot learning using 26,463 andesite samples from the GEOROC database. For large-sample scenarios, optimized ensemble models (Random Forest, XGBoost, LightGBM) achieve high precision, with an Area Under the Receiver Operating Characteristic Curve (AUC, a metric reflecting overall classification performance) ≥ 0.99. LightGBM emerges as the dominant model, with a recall rate of 97% for small-sample RV. For rare tectonic types, a meta-learning (TabPFN pre-training) and knowledge distillation (transfer to CatBoost) framework boosts the recall rates of RV and OI to 99% while optimizing the inference speed to 0.01 seconds per sample. SHAP analysis identifies key discriminant elements (e.g., TiO2 and FeOt for CM; Nb and Lu for OI) and their synergistic effects, verifying classical magmatic theories (e.g., Fe-Ti oxide differentiation in subduction zones). This framework provides a reproducible standard for intermediate igneous rock classification, aiding paleotectonic reconstruction and mineral exploration in the future.
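
The knowledge distillation step mentioned in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy, not the authors' pipeline: it uses neither TabPFN nor CatBoost. A fixed logistic function stands in for the slow, accurate teacher, and a one-feature logistic regression stands in for the fast student. The student is fit to the teacher's soft probabilities rather than to hard labels, which is the core idea of distillation. All names and numbers here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def teacher_proba(x):
    """Hypothetical stand-in for a teacher model: returns P(class = 1 | x)."""
    return sigmoid(3.0 * x - 1.5)


def distill(xs, soft_targets, lr=0.5, epochs=5000):
    """Fit a logistic-regression student to the teacher's soft labels.

    Minimizes cross-entropy between student and teacher probabilities
    with plain batch gradient descent.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, q in zip(xs, soft_targets):
            p = sigmoid(w * x + b)
            # Gradient of cross-entropy with soft target q w.r.t. the logit is (p - q).
            grad_w += (p - q) * x
            grad_b += p - q
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# 31 evenly spaced toy feature values from -1.0 to 2.0 (invented for illustration).
xs = [i / 10 for i in range(-10, 21)]
soft = [teacher_proba(x) for x in xs]  # teacher's soft labels

w, b = distill(xs, soft)

# Compare student and teacher on a point that was not in the training grid.
gap = abs(sigmoid(w * 0.55 + b) - teacher_proba(0.55))
```

Because the toy teacher is itself logistic, the student can match it almost exactly; with a real teacher and a tree-based student the agreement would only be approximate, and the benefit would come from the student's faster inference.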

Cite This Study

Li et al. studied this question.

synapsesocial.com/papers/69897983f0ec2af6756e73e2 https://doi.org/10.1080/20964471.2025.2603810
