Large language models (LLMs) have successfully performed calculation-based tasks, generated diverse data visualizations, and executed chemometric analyses. This study systematically evaluated the performance of Microsoft M365 Copilot (GPT-5) on 35 representative questions spanning five domains: (1) chemical equilibrium, pH, titration, and buffer calculations; (2) data visualization, including histograms, box plots, correlation plots, and heatmaps; (3) analysis of periodic table properties using principal component analysis (PCA); (4) image interpretation and generation in classroom contexts; and (5) machine learning applications using Partial Least Squares Discriminant Analysis (PLS-DA). All questions were posed without additional prompting. Identical question sets were administered from two independent user accounts twice per month between October and December 2025. Copilot consistently produced accurate, step-by-step solutions for equilibrium and acid–base problems, generated high-quality visualizations directly from uploaded datasets, and correctly constructed PCA score and loading plots with appropriate data standardization. Collectively, these findings indicate that Copilot offers substantial value for both research-oriented tasks and chemistry education.
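As a rough illustration of the kind of buffer calculation covered by the first domain, the sketch below applies the Henderson–Hasselbalch equation, pH = pKa + log10([A-]/[HA]). It is not taken from the study's question set: the function name `buffer_ph` and the acetic acid/acetate values are assumptions chosen only for illustration.

```python
import math


def buffer_ph(pka: float, conc_base: float, conc_acid: float) -> float:
    """Estimate buffer pH with the Henderson-Hasselbalch equation.

    pH = pKa + log10([A-] / [HA]); both concentrations in mol/L.
    This is an approximation that ignores activity effects.
    """
    if conc_base <= 0 or conc_acid <= 0:
        raise ValueError("concentrations must be positive")
    return pka + math.log10(conc_base / conc_acid)


# Hypothetical example (not from the study): an acetic acid/acetate
# buffer with equal concentrations of acid and conjugate base,
# taking pKa = 4.76 as an assumed textbook value.
ph = buffer_ph(pka=4.76, conc_base=0.10, conc_acid=0.10)
print(f"Estimated buffer pH: {ph:.2f}")
```

With equal acid and base concentrations the logarithmic term vanishes, so the estimated pH equals the pKa, which gives a quick check on any step-by-step solution of this type.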
Pereira et al. (Fri,) studied this question.