
Lumbar disc herniation with radiculopathy: a comparison of NASS guidelines and ChatGPT


N Am Spine Soc J. 2024 Jun 1;19:100333. doi: 10.1016/j.xnsj.2024.100333. eCollection 2024 Sep.

ABSTRACT

BACKGROUND: ChatGPT is an advanced language model capable of generating responses to clinical questions, including those concerning lumbar disc herniation with radiculopathy. Artificial intelligence (AI) tools are increasingly being considered to assist clinicians in decision-making. This study compared ChatGPT-3.5 and ChatGPT-4.0 responses with the established North American Spine Society (NASS) clinical guidelines and evaluated their concordance.

METHODS: ChatGPT-3.5 and ChatGPT-4.0 were prompted with fifteen questions from the 2012 NASS Clinical Guidelines for the diagnosis and treatment of lumbar disc herniation with radiculopathy. The clinical questions, organized by category, were entered directly into ChatGPT as unmodified queries. Language output was assessed by two independent authors on September 26, 2023 against four operationally defined parameters: accuracy, over-conclusiveness, supplementary information, and incompleteness. ChatGPT-3.5 and ChatGPT-4.0 performance was compared via chi-square analyses.
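For illustration only, a roughly equivalent query could be issued programmatically via OpenAI's Python API; note that the study entered questions into the ChatGPT web interface, so API outputs may differ, and the example question below is written in the style of the NASS guideline questions rather than being one of the fifteen verbatim queries.

```python
# Illustrative sketch only: the study used the ChatGPT web interface
# (September 26, 2023), not the API, so responses may differ.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical example question in the style of the NASS guideline
# questions; not necessarily one of the study's fifteen queries.
question = (
    "What history and physical examination findings are consistent with "
    "a diagnosis of lumbar disc herniation with radiculopathy?"
)

for model in ("gpt-3.5-turbo", "gpt-4"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)
```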

RESULTS: Among the 15 responses produced by ChatGPT-3.5, 7 (47%) were accurate, 7 (47%) were over-conclusive, 15 (100%) were supplementary, and 6 (40%) were incomplete. For ChatGPT-4.0, 10 (67%) were accurate, 5 (33%) were over-conclusive, 10 (67%) were supplementary, and 6 (40%) were incomplete. There was a statistically significant difference in supplementary information between ChatGPT-3.5 and ChatGPT-4.0 (100% vs. 67%; p=.014). Accuracy (47% vs. 67%; p=.269), over-conclusiveness (47% vs. 33%; p=.456), and incompleteness (40% vs. 40%; p=1.000) did not differ significantly. Both models yielded 100% accuracy in the definition and the history and physical examination categories. Diagnostic testing yielded 0% accuracy for ChatGPT-3.5 and 100% for ChatGPT-4.0; nonsurgical interventions, 50% and 63%, respectively; and surgical interventions, 0% and 33%.
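The reported p-values can be reproduced with a minimal sketch, assuming 2x2 contingency tables of counts out of 15 per model and Pearson's chi-square without Yates continuity correction (this choice of test matches all four reported p-values):

```python
# Minimal sketch reproducing the abstract's chi-square comparisons.
# Assumes 2x2 tables of response counts (n=15 per model) and Pearson's
# chi-square without continuity correction.
from scipy.stats import chi2_contingency

# parameter: (count for ChatGPT-3.5, count for ChatGPT-4.0), each out of 15
comparisons = {
    "accuracy":                  (7, 10),   # reported p = .269
    "over-conclusiveness":       (7, 5),    # reported p = .456
    "supplementary information": (15, 10),  # reported p = .014
    "incompleteness":            (6, 6),    # reported p = 1.000
}

for name, (gpt35, gpt40) in comparisons.items():
    table = [[gpt35, 15 - gpt35],
             [gpt40, 15 - gpt40]]
    chi2, p, dof, _ = chi2_contingency(table, correction=False)
    print(f"{name}: chi2 = {chi2:.3f}, p = {p:.3f}")
```

Running this yields p = .269, .456, .014, and 1.000 for the four parameters, matching the values reported above.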

CONCLUSIONS: ChatGPT-4.0 provided less supplementary information and higher overall accuracy across question categories than ChatGPT-3.5. ChatGPT showed reasonable concordance with NASS guidelines, but clinicians should exercise caution in using ChatGPT in its current state, as it fails to safeguard against misinformation.

PMID:39040948 | PMC:PMC11261487 | DOI:10.1016/j.xnsj.2024.100333
