Artificial intelligence (AI), particularly large language models such as GPT-4o, holds promise for enhancing diagnostic accuracy in healthcare. This study evaluates the diagnostic performance of GPT-4o against that of human ophthalmologists in glaucoma cases. A prospective, observational study was conducted at a tertiary care ophthalmology center. Twenty-six glaucoma cases, including both primary and secondary types, were selected from publicly available databases and institutional records. The cases were analyzed by GPT-4o and by three ophthalmologists with varying levels of experience. The accuracy and completeness of primary and differential diagnoses were assessed using 10-point and 6-point Likert scales, respectively. Statistical analyses were performed using nonparametric methods, including the Kruskal-Wallis and Mann-Whitney U tests. GPT-4o was significantly less accurate in primary diagnosis than the human ophthalmologists. Specifically, GPT-4o achieved a mean score of 5.500 (p < 0.001), compared with 8.038 (p < 0.001) for Doctor C, who had the highest score. GPT-4o's completeness score (3.077, p < 0.001) was also lower than that of Doctor B, who had the lowest score among the human ophthalmologists (3.615, p < 0.001). For differential diagnosis, however, GPT-4o (7.577) showed accuracy comparable to that of Doctor A (7.615) and Doctor C (7.673) (p < 0.0001) while achieving the highest completeness score (4.096), outperforming Doctor C (3.846), Doctor A (2.923), and Doctor B (2.808) (p < 0.0001). AI, including GPT-4o, is currently not an acceptable standalone method for diagnosing glaucoma because of its lower accuracy compared with human clinicians. These findings suggest that GPT-4o could serve as a valuable adjunct in clinical practice, particularly in complex cases, but should not replace human expertise, especially for initial diagnoses. Future improvements in AI models could enhance their utility in ophthalmology.
First author affiliation: [1] Capital Med Univ, Beijing Tongren Hosp, Beijing Tongren Eye Ctr, Beijing Ophthalmol & Visual Sci Key Lab, Beijing 100730, Peoples R China
Recommended citation (GB/T 7714):
Zhang Junxiu, Ma Yao, Zhang Rong, et al. A comparative study of GPT-4o and human ophthalmologists in glaucoma diagnosis[J]. SCIENTIFIC REPORTS. 2024, 14(1). doi:10.1038/s41598-024-80917-x.
APA:
Zhang, Junxiu, Ma, Yao, Zhang, Rong, Chen, Yanhua, Xu, Mengyao, ... & Ma, Ke. (2024). A comparative study of GPT-4o and human ophthalmologists in glaucoma diagnosis. SCIENTIFIC REPORTS, 14(1).
MLA:
Zhang, Junxiu, et al. "A comparative study of GPT-4o and human ophthalmologists in glaucoma diagnosis." SCIENTIFIC REPORTS 14.1 (2024).