ABSTRACT Advances in artificial intelligence (AI) and large language models (LLMs) are transforming materials research by enabling automated data extraction, knowledge integration, and property prediction. This study presents a dual‐stage, LLM‐assisted framework for magnesium alloy design that combines semantic extraction, thermodynamic reasoning, and machine learning (ML). Using Qwen‐2.5, alloy chemistry, processing details, and thermal and mechanical property data are automatically extracted from full‐text literature and converted into structured records. The extracted records are then augmented with thermodynamic phase descriptors predicted by DeepSeek‐R1 and with numerical processing features generated from CLIP‐based embeddings. The feature set is reduced using sequential backward selection (SBS), and predictive models are developed using support vector machines (SVM), random forest (RF), and eXtreme Gradient Boosting (XGB). The proposed workflow integrates chemistry, thermodynamics, and processing history to predict thermal conductivity (TC), yield strength (YS), and ultimate tensile strength (UTS). The best‐performing models yielded R² values of ~0.80 (RMSE ~9.98 W·m⁻¹·K⁻¹), ~0.69 (RMSE ~37.2 MPa), and ~0.73 (RMSE ~31.5 MPa) for TC, YS, and UTS, respectively. Validation against CALPHAD calculations shows that DeepSeek‐R1 reproduces equilibrium phase fractions to within 1 wt.%. Overall, this work shows that semantic intelligence can link literature‐derived knowledge with predictive modeling, providing a pathway for processing‐informed alloy design.
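As an illustration of the feature-selection and evaluation steps named in the abstract, the following is a minimal, standard-library sketch of greedy sequential backward selection together with the R² and RMSE metrics. It is not the authors' implementation: the feature names and the `toy_score` function are hypothetical stand-ins, and in practice the score would more likely be a cross-validated metric of an SVM, RF, or XGB model.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def sequential_backward_selection(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    min_features: int = 1,
) -> Tuple[List[str], float]:
    """Greedy SBS: start from all features and, at each step, drop the one
    whose removal leaves the highest-scoring subset. Returns the best
    subset seen along the way and its score."""
    current = list(features)
    best_subset, best_score = list(current), score(current)
    while len(current) > min_features:
        # Evaluate every subset obtained by removing exactly one feature.
        candidates = [[f for f in current if f != drop] for drop in current]
        top_score, top_subset = max(
            ((score(c), c) for c in candidates), key=lambda sc: sc[0]
        )
        current = top_subset
        if top_score >= best_score:
            best_score, best_subset = top_score, list(top_subset)
    return best_subset, best_score


# Hypothetical feature names and scoring rule, for demonstration only.
INFORMATIVE = {"Al_wt", "Zn_wt", "extrusion_temp"}


def toy_score(subset: List[str]) -> float:
    """Assumed score: reward informative features, penalize subset size."""
    return len(INFORMATIVE & set(subset)) - 0.1 * len(subset)


if __name__ == "__main__":
    all_features = ["Al_wt", "Zn_wt", "extrusion_temp", "noise_a", "noise_b"]
    selected, best = sequential_backward_selection(all_features, toy_score)
    print(selected, best)
```

With the toy score, the two noise features are eliminated first, and the three informative ones are retained as the best subset. Replacing `toy_score` with a function that fits a regressor on the chosen columns and returns a validation R² would turn this into a usable selection loop.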
Lu et al. (Sun,) studied this question.