What type of study is this?

This is a quantitative study.

October 23, 2025 | Open Access

Decrypting Complexity: A Tri-Metric Evaluation of Readability and Fidelity in AI-Simplified Scientific Texts for ESL University Learners

Key Points

AI-based simplification improved the readability of scientific abstracts for ESL university learners.
Simplified texts kept adequate semantic similarity to the originals while reaching lower Flesch-Kincaid Grade Levels and higher Flesch Reading Ease scores.
Latent semantic analysis, t-tests and correlation analysis were used to assess content fidelity and to check the reliability and potential bias of the AI output.
The weak correlation between AI-based and human expert evaluations calls for careful integration of AI tools, with outputs validated against expert judgment.
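The three readability metrics named in this summary can be sketched as follows. This is a minimal illustration using the commonly published formulas, not the authors' tooling: the function names are hypothetical, the syllable counter is a rough vowel-group heuristic, and EFLAW "miniwords" are assumed to be words of three letters or fewer.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int, int]:
    """Return (words, sentences, syllables, miniwords) for a plain-text passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    miniwords = sum(1 for w in words if len(w) <= 3)  # EFLAW "miniwords"
    return len(words), len(sentences), syllables, miniwords


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade level; lower means easier text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def mcalpine_eflaw(words: int, sentences: int, miniwords: int) -> float:
    """McAlpine EFLAW: (words + miniwords) / sentences; lower means easier text."""
    return (words + miniwords) / sentences
```

Scoring an abstract before and after simplification with the same functions gives the pre- and post-transformation values that the study compares. Dedicated readability tools count syllables more accurately, so absolute scores from this sketch may differ from theirs.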

Abstract

Undergraduate university students in ESL contexts often need more readable versions of the complex scientific articles published in research journals. This study assessed the efficacy and "toolability" of ChatGPT in improving the readability of research abstracts from language and linguistics journals indexed in Scopus and Web of Science. Latent semantic analysis (LSA) with vector-space document embedding was performed to evaluate co-occurrence and conceptual preservation. One hundred abstracts (n = 100), extracted from four journals, were submitted to an open ChatGPT-4o session with a prompt to simplify them for undergraduate-level ESL users. Three metrics, Flesch-Kincaid Grade Level, Flesch Reading Ease and McAlpine EFLAW, were used to measure readability before and after transformation. Content fidelity between the input and output texts was measured by LSA on a scale from 0 to 1. To rule out bias, field experts objectively evaluated a randomly selected subset (n = 50). In addition, t-tests and correlation analysis were conducted to compare the estimates and evaluate accuracy. The findings showed adequate semantic similarity and fidelity, largely ruling out semantic disruption after simplification. Readability increased, with a lower Flesch-Kincaid Grade Level, a higher Flesch Reading Ease score and a representative EFLAW score. However, the weak correlation between the LSA scores and the field experts' estimates warrants caution and cross-checking of AI estimates against human ones. The study offers micro-, meso- and macro-level implications for incorporating AI into scientific reading comprehension, provided that unsupervised dependence on it is treated with caution. Future research may involve other metrics such as BERTScore, robust mixed-methods research designs, comparative evaluation of texts using cognitive protocols, and other AI models.
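The statistical comparisons mentioned in the abstract can be illustrated with a short sketch. The abstract does not say which t-test variant was used. Because each abstract is scored both before and after simplification, a paired t-test is assumed here, and Pearson's r is assumed for the correlation between LSA scores and expert ratings. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(before: list[float], after: list[float]) -> tuple[float, int]:
    """Paired t statistic and degrees of freedom for pre/post scores.

    Assumes both lists describe the same texts in the same order.
    """
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation, e.g. between LSA fidelity scores and expert ratings."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

With Flesch-Kincaid Grade Levels as input, a large negative t from `paired_t(before, after)` would indicate that the simplified texts score at a lower grade level. An r near zero from `pearson_r` would correspond to the weak agreement between LSA and expert estimates that the abstract reports.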
