What question did this study set out to answer?

The aim is to evaluate how effective large language models can be in supporting mental health diagnostic assessments.

April 19, 2026

Exploring The Potential of Large Language Models for Assisting with Mental Health Diagnostic Assessments

Key Points

The aim is to evaluate how effective large language models can be in supporting mental health diagnostic assessments.
Examined diagnostic processes from PHQ-9 and GAD-7 questionnaires.
Investigated prompting and fine-tuning techniques for various LLMs.
Evaluated agreement between LLM outcomes and expert-validated diagnoses.
Found that fine-tuned models showed improved accuracy in diagnostic outcomes.
Proprietary and open-source LLMs demonstrated varying levels of adherence to standard procedures.
Agreement rates between LLM-generated outcomes and expert assessments were quantified.
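
The key points mention quantifying agreement between LLM-generated outcomes and expert assessments, but this summary does not say which agreement metric the study used. As a minimal sketch under that caveat, the snippet below computes two common choices, raw percent agreement and Cohen's kappa, over categorical labels. The function names and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def percent_agreement(pred, gold):
    """Fraction of items where the model label equals the expert label."""
    if len(pred) != len(gold) or not gold:
        raise ValueError("pred and gold must be non-empty and the same length")
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def cohens_kappa(pred, gold):
    """Chance-corrected agreement between two label sequences."""
    n = len(gold)
    p_observed = percent_agreement(pred, gold)
    pred_counts = Counter(pred)
    gold_counts = Counter(gold)
    # Expected agreement if both raters labeled independently
    # according to their own marginal label frequencies.
    labels = set(pred) | set(gold)
    p_expected = sum(pred_counts[c] * gold_counts[c] for c in labels) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: four posts labeled by a model and by an expert.
llm_labels = ["mild", "moderate", "mild", "severe"]
expert_labels = ["mild", "moderate", "moderate", "severe"]
agreement = percent_agreement(llm_labels, expert_labels)
kappa = cohens_kappa(llm_labels, expert_labels)
```

Kappa is reported alongside raw agreement because raw agreement can look high simply when one label dominates the data.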

Abstract

Large language models (LLMs) are increasingly attracting the attention of healthcare professionals for their potential to assist in diagnostic assessments, which could alleviate the strain on the healthcare system caused by a high patient load and a shortage of providers. For LLMs to be effective in supporting diagnostic assessments, it is essential that they closely replicate the standard diagnostic procedures used by clinicians. In this paper, we specifically examine the diagnostic assessment processes described in the Patient Health Questionnaire-9 (PHQ-9) for major depressive disorder (MDD) and the Generalized Anxiety Disorder-7 (GAD-7) questionnaire for generalized anxiety disorder (GAD). We investigate various prompting and fine-tuning techniques to guide both proprietary and open-source LLMs in adhering to these processes, and we evaluate the agreement between LLM-generated diagnostic outcomes and expert-validated ground truth. For fine-tuning, we utilize the Mentalllama and Llama models, while for prompting, we experiment with proprietary models like GPT-3.5 and GPT-4o, as well as open-source models such as llama-3.1-8b and mixtral-8x7b.

Software Availability. We make all software artifacts available on GitHub.

Institutional Review Board (IRB). This study does not require approval from the Institutional Review Board (IRB). It involves using clinician-annotated social media posts, authorized for research purposes. The primary objective is to evaluate the effectiveness of LLMs that incorporate diagnostic criteria for major depressive disorder and generalized anxiety disorder for assisting with mental health assessments.
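
For readers unfamiliar with the two instruments, the sketch below shows the conventional scoring that an LLM would need to reproduce: each item is rated 0 to 3, the item scores are summed, and the total maps to a severity band. The cut-offs are the commonly published ones for PHQ-9 (total 0 to 27) and GAD-7 (total 0 to 21). How the study's own pipeline encodes these steps is not described in this summary, so the function and constant names here are hypothetical.

```python
# Upper bound of each severity band (inclusive), in ascending order.
PHQ9_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]
GAD7_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (21, "severe"),
]


def score_questionnaire(item_scores, n_items, bands):
    """Sum 0-3 item ratings and map the total to a severity label."""
    if len(item_scores) != n_items:
        raise ValueError(f"expected {n_items} item scores, got {len(item_scores)}")
    if any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("each item score must be an integer from 0 to 3")
    total = sum(item_scores)
    for upper, label in bands:
        if total <= upper:
            return total, label
    raise AssertionError("unreachable: total exceeds the highest band")


def score_phq9(item_scores):
    """Score the nine PHQ-9 items."""
    return score_questionnaire(item_scores, 9, PHQ9_BANDS)


def score_gad7(item_scores):
    """Score the seven GAD-7 items."""
    return score_questionnaire(item_scores, 7, GAD7_BANDS)


# Hypothetical responses, not data from the study.
phq9_total, phq9_severity = score_phq9([1, 1, 1, 1, 1, 1, 1, 1, 1])
gad7_total, gad7_severity = score_gad7([3, 3, 3, 3, 3, 3, 3])
```

A deterministic reference like this gives a fixed target against which model-produced totals and severity labels can be compared.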

Cite This Study

Roy et al. studied this question.

synapsesocial.com/papers/69e4741c010ef96374d8fd6f
https://doi.org/10.1145/3805697
