Violence in mental health care, encompassing both self-directed behaviors such as suicidal ideation and attempts and other-directed behaviors including aggression and assault, represents a major clinical and public health challenge. Traditional violence-risk assessment tools and structured clinical data offer limited predictive accuracy and often fail to capture the dynamic, contextual, and linguistic signals embedded in clinical narratives. Natural language processing (NLP) has emerged as a promising approach to leverage free-text electronic medical record notes for more precise and timely violence-risk prediction. This systematic review synthesizes evidence on the use and effectiveness of NLP-based models applied to unstructured clinical text to predict self-directed and other-directed violence in health care populations, with the aim of informing both the research evidence base and the clinical readiness of these tools. Following PRISMA guidelines and a registered PROSPERO protocol, comprehensive searches identified 21 eligible studies spanning diverse clinical settings, populations, outcomes, and modeling strategies. Across studies, NLP-enhanced models consistently outperformed structured-data–only approaches, with area under the receiver operating characteristic curve values frequently exceeding 0.80 for self-directed outcomes and demonstrating meaningful gains for aggression-related predictions. Performance improvements were particularly pronounced for short-term and near-event prediction horizons. Methodological approaches varied substantially with respect to preprocessing pipelines, embedding techniques, algorithms, validation strategies, and outcome definitions, limiting direct comparability and meta-analytic synthesis. Reporting of calibration, interpretability, fairness, and external validation was inconsistent, and many studies exhibited moderate to high risk of bias under PROBAST and PROBAST-AI criteria.
Taken together, the findings indicate that free-text clinical notes contain clinically meaningful signals not captured by structured data alone, but also highlight important constraints on immediate clinical deployment. This review aims to support clinicians, health systems, and researchers in interpreting current model performance, understanding residual risks, and identifying the methodological and governance requirements necessary for safe and effective clinical implementation of NLP-based violence-risk prediction tools.
Grenier et al. studied this question.