- Gemini 2.5 Flash broadly matched clinical triage patterns; relational support and action orientation appeared across all suicide risk levels.
- Safety escalation elements increased with higher risk, while psychoeducation declined and coping skills appeared only for non-suicidal distress.
- Gemini failed to recognise implicit suicidality in an ambiguous ideation vignette; general-purpose LLMs may support supervised triage research and training but remain unsuitable for autonomous crisis intervention.
Stud Health Technol Inform. 2026 May 21;336:1009-1013. doi: 10.3233/SHTI260331.
ABSTRACT
Large language models (LLMs) are increasingly used for information seeking and self-guided support, raising safety concerns in high-risk contexts such as suicidal ideation. While prior work suggests LLMs can detect suicidal language, their ability to calibrate responses across suicide risk levels remains unclear. This pilot study evaluated Gemini 2.5 Flash using 60 Self-Directed Violence Classification System (SDVCS) clinical vignettes spanning non-suicidal distress (L0), suicidal ideation without plan (L1), and imminent suicide risk (L2) (n=20 per level). Outputs were coded for seven response elements: relational support (REL), reflection (REF), psychoeducation (PSY), coping skills (COP), action orientation (ACT), risk acknowledgment (RISK), and crisis resources (INFO). REL and ACT were common across levels. Safety escalation elements increased with risk level, whereas PSY declined as risk increased and COP was present only at the non-suicidal level (L0). Notably, the model failed to recognize implicit suicidality in an ambiguous ideation vignette. Overall, Gemini 2.5 Flash approximated clinical triage patterns but demonstrated critical limitations in detecting indirect suicidal expressions. These findings suggest that general-purpose LLMs may support supervised triage research and training but remain unsuitable for autonomous crisis intervention.
PMID:42175005 | DOI:10.3233/SHTI260331