The use of artificial intelligence in scholarly publishing is quickly becoming a focal point as researchers, journal editors, and publishers determine how to balance the benefits of AI against its potential ethical drawbacks. We used four AI text detectors to analyze one hundred articles published in economics journals in 2019, before the release of ChatGPT, and one hundred articles from the same journals published in 2023–24, after its release. Using the AI likelihood score generated by each detector as the dependent variable and controlling for the number of authors, word count, and journal fixed effects, we find that only ZeroGPT produced higher AI likelihood scores for the post-AI papers than for the pre-AI papers, by 3.89 percentage points; however, this result was statistically significant only at the 10 per cent level (90 per cent confidence). For the other AI detectors, Originality, Crossplag, and Winston, the AI scores of pre- and post-AI papers were not statistically different once the other factors were controlled for. We also find little to no correlation between the scores produced by the four detectors when scanning the same papers. This research highlights the fallibility of AI detection and the need for further research as a foundation for developing future AI policies in scholarly publishing.
Kinney et al. studied this question.
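The regression described in the abstract (a detector's AI likelihood score regressed on a post-ChatGPT indicator, with the number of authors, word count, and journal fixed effects as controls) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, variable names, and the use of conventional (non-robust) OLS standard errors are hypothetical choices, the data below are synthetic placeholders rather than the study's data, and the abstract does not say which software or error structure the authors used.

```python
import numpy as np

def fit_post_ai_effect(score, post, n_authors, word_count, journal):
    """OLS of a detector score on a post-ChatGPT dummy plus controls.

    Returns (coefficient on `post`, its conventional OLS standard error).
    Names and specification details are assumptions for illustration.
    """
    score = np.asarray(score, dtype=float)
    journal = np.asarray(journal)
    levels = np.unique(journal)
    # Journal fixed effects: one dummy per journal, first journal as baseline.
    fixed_effects = (journal[:, None] == levels[None, 1:]).astype(float)
    X = np.column_stack([
        np.ones_like(score),                   # intercept
        np.asarray(post, dtype=float),         # 1 = 2023-24 paper, 0 = 2019 paper
        np.asarray(n_authors, dtype=float),    # control: number of authors
        np.asarray(word_count, dtype=float),   # control: word count
        fixed_effects,
    ])
    beta, *_ = np.linalg.lstsq(X, score, rcond=None)
    resid = score - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])

# Synthetic placeholder data (NOT the study's data): 100 pre- and 100 post-release papers.
rng = np.random.default_rng(0)
n = 200
post = np.repeat([0, 1], n // 2)
n_authors = rng.integers(1, 6, size=n)
word_count = rng.normal(9000, 2000, size=n)
journal = rng.integers(0, 5, size=n)
score = 10 + 4.0 * post + 0.5 * n_authors + rng.normal(0, 5, size=n)

coef, se = fit_post_ai_effect(score, post, n_authors, word_count, journal)
print(f"post-AI coefficient: {coef:.2f} (SE {se:.2f})")
```

The coefficient on `post` plays the role of the percentage-point difference between post-AI and pre-AI papers reported in the abstract; running the same fit once per detector would give one such estimate for each of the four tools.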