This study evaluates whether generalized large language models (LLMs) can independently complete all tasks required in an undergraduate financial accounting course. Three leading LLMs (ChatGPT 5, Gemini 2.5, and Claude Sonnet 4.5) performed every assignment, exam, and project in ACCT 2301 at Lamar University without human prompting beyond the attachment of required work. Results show near-perfect accuracy on structured tasks, with all three models achieving final course averages exceeding 96%, compared to the student average of 78.4%. Performance weaknesses emerged on the multi-step Jackson Cycle project, aligning with prior artificial intelligence (AI) literature on long-horizon reasoning limitations. Broader implications for accounting education, labor markets, assessment design, and AI governance are discussed.
Orrin Swift (Mon,) studied this question.