Evidence
Mod Pathol. 2024 Jul 16:100563. doi: 10.1016/j.modpat.2024.100563. Online ahead of print.
ABSTRACT
The biopsy Gleason score is an important prognostic marker for prostate cancer patients. It is, however, subject to substantial variability among pathologists. Artificial intelligence (AI)-based algorithms employing deep learning have shown their ability to match pathologists’ performance in assigning Gleason scores, with the potential to enhance pathologists’ grading accuracy. The performance of Gleason AI algorithms in research is mostly reported on common benchmark datasets or within public challenges. In contrast, many commercial algorithms are evaluated in clinical studies, for which data is not publicly released. As commercial AI vendors typically do not publish performance on public benchmarks, comparison between research and commercial AI is difficult. The aim of this study is to evaluate and compare the performance of top-ranked public and commercial algorithms using real-world data. We curated a diverse dataset of whole-slide prostate biopsy images through crowdsourcing, containing images with a range of Gleason scores and from diverse sources. Predictions were obtained from five top-ranked public algorithms from the PANDA challenge and from two commercial Gleason grading algorithms. Additionally, ten pathologists evaluated the dataset in a reader study. Overall, the pairwise quadratic weighted kappa among pathologists ranged from 0.777 to 0.916. Both public and commercial algorithms showed high agreement with pathologists, with quadratic weighted kappa ranging from 0.617 to 0.900. Commercial algorithms performed on par with or outperformed top public algorithms.
PMID:39025402 | DOI:10.1016/j.modpat.2024.100563
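The agreement figures in the abstract are quadratic weighted kappa values. As a minimal sketch of how such a statistic can be computed for a pair of raters, the following uses the standard definition of Cohen's kappa with quadratic disagreement weights. It is not the study's own code: the function names, the integer encoding of grades (for example, six ordered categories numbered 0 to 5), and the reader labels are assumptions made for illustration.

```python
from itertools import combinations


def quadratic_weighted_kappa(ratings_a, ratings_b, num_categories):
    """Cohen's kappa with quadratic weights for two raters.

    Ratings are assumed (for this sketch) to be integers in
    0..num_categories-1, one per slide, in the same slide order.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and equally long")
    if num_categories < 2:
        raise ValueError("need at least two categories")

    k = num_categories
    n = len(ratings_a)

    # Observed confusion matrix: observed[i][j] counts slides graded
    # i by rater A and j by rater B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1

    # Marginal histograms of each rater's grades.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: larger grade gaps cost more.
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        # Both raters used one identical category; kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected


def pairwise_kappas(readers, num_categories):
    """Kappa for every pair of readers.

    `readers` is a hypothetical mapping of reader name -> list of grades.
    """
    return {
        (name_a, name_b): quadratic_weighted_kappa(
            readers[name_a], readers[name_b], num_categories
        )
        for name_a, name_b in combinations(readers, 2)
    }
```

Identical gradings from two raters give a kappa of 1.0, and larger grade disagreements lower the value more than off-by-one disagreements do. Calling `pairwise_kappas` on the grades of all readers yields one value per reader pair, which is the form in which a pairwise range such as the one reported for the pathologists would be summarized.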
Evaluation of AI-based Gleason grading algorithms “in the wild”