Auditory brainstem response (ABR) interpretation in clinical practice often relies on visual inspection by audiologists, which is prone to inter-practitioner variability. While deep learning (DL) algorithms have shown promise in objectifying ABR detection in controlled settings, their applicability to real-world clinical data is hindered by small datasets and insufficient heterogeneity. This study evaluates the generalizability of nine DL models for ABR detection using large, multicenter datasets. The primary dataset analyzed, Clinical Dataset I, comprises 128,123 labeled ABRs from 13,813 participants across a wide range of ages and hearing levels, and was divided into a training set (90%) and a held-out test set (10%). The models included convolutional neural networks (CNNs; AlexNet, VGG, ResNet), transformer-based architectures (Transformer, Patch Time Series Transformer [PatchTST], Differential Transformer, and Differential PatchTST), and hybrid CNN-transformer models (ResTransformer, ResPatchTST). Performance was assessed on the held-out test set and four external datasets (Clinical II, Southampton, PhysioNet, Mendeley) using accuracy and area under the receiver operating characteristic curve (AUC). ResPatchTST achieved the highest performance on the held-out test set (accuracy: 91.90%, AUC: 0.976). Transformer-based models, particularly PatchTST, showed superior generalization to external datasets, maintaining robust accuracy across diverse clinical settings. Additional experiments highlighted the critical role of dataset size and diversity in enhancing model robustness. We also observed that incorporating acquisition parameters and demographic features as auxiliary inputs yielded performance gains in cross-center generalization. These findings underscore the potential of DL models, especially transformer-based architectures, for accurate and generalizable ABR detection, and highlight the necessity of large, diverse datasets in developing clinically reliable systems.
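For readers unfamiliar with the model family the abstract highlights, the following is a minimal sketch (not the authors' implementation) of a PatchTST-style binary classifier applied to 1-D ABR waveforms, evaluated with the same metrics reported in the study (accuracy and AUC). The waveform length, patch size, model width, and all variable names are illustrative assumptions.

```python
# Minimal sketch, assuming 256-sample ABR epochs; hyperparameters are guesses,
# not the values used in the paper.
import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score, roc_auc_score


class PatchTSTClassifier(nn.Module):
    """Splits each 1-D waveform into non-overlapping patches, embeds them as
    tokens, encodes with a Transformer, and mean-pools for a binary decision."""

    def __init__(self, seq_len=256, patch_len=16, d_model=64, n_heads=4, n_layers=3):
        super().__init__()
        assert seq_len % patch_len == 0
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)                  # patch -> token
        self.pos = nn.Parameter(torch.zeros(1, seq_len // patch_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)                           # response present / absent

    def forward(self, x):                                           # x: (batch, seq_len)
        patches = x.unfold(1, self.patch_len, self.patch_len)       # (batch, n_patch, patch_len)
        tokens = self.embed(patches) + self.pos
        enc = self.encoder(tokens)
        return self.head(enc.mean(dim=1)).squeeze(-1)               # logits


# Toy evaluation on synthetic data, mirroring the accuracy/AUC metrics used
# on the held-out and external test sets.
model = PatchTSTClassifier()
x = torch.randn(8, 256)                                             # 8 synthetic ABR epochs
y = torch.tensor([0., 1., 0., 1., 0., 1., 0., 1.])                  # 1 = response present
with torch.no_grad():
    probs = torch.sigmoid(model(x))
print("accuracy:", accuracy_score(y, (probs > 0.5).int()))
print("AUC:", roc_auc_score(y, probs))
```

The hybrid models named in the abstract (e.g., ResPatchTST) would, by analogy, prepend a convolutional feature extractor before the patching step; that detail is not shown here.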
Funding:
Natural Science Foundation of Chongqing; Department of Otorhinolaryngology & Hearing and Speech Rehabilitation; First Affiliated Hospital of Chongqing Medical University
First author affiliation: [1]Chongqing Univ Posts & Telecommun, Sch Commun & Informat Engn, Chongqing, Peoples R China; [2]Chongqing Univ Posts & Telecommun, Inst Adv Sci, Chongqing, Peoples R China
Corresponding author:
Corresponding institution: [5]Capital Med Univ, Beijing Inst Otolaryngol, Beijing Tongren Hosp, Beijing 100005, Peoples R China; [6]Ear Sci Inst Australia, Subiaco, Australia; [7]Univ Western Australia, Med Sch, Crawley, Australia
Recommended citation (GB/T 7714):
Liu Yin, Xiang Lingjie, Li Qiang, et al. Comparison of Deep Learning Models for Objective Auditory Brainstem Response Detection: A Multicenter Validation Study[J]. Trends in Hearing, 2025, 29. doi:10.1177/23312165251347773.
APA:
Liu, Yin, Xiang, Lingjie, Li, Qiang, Li, Kangkang, Yang, Yihan, ... & Gao, Chenqiang. (2025). Comparison of Deep Learning Models for Objective Auditory Brainstem Response Detection: A Multicenter Validation Study. Trends in Hearing, 29. https://doi.org/10.1177/23312165251347773
MLA:
Liu, Yin, et al. "Comparison of Deep Learning Models for Objective Auditory Brainstem Response Detection: A Multicenter Validation Study." Trends in Hearing, vol. 29, 2025.