Completing continuous circular capsulorhexis (CCC) requires the operator to perform fine manipulations, and recognizing these continuous fine actions accurately is difficult because the action classes in CCC procedures are imbalanced. Multimodal deep learning can improve classifier performance, but the recognition accuracy of minority classes remains difficult to improve. To address these problems, a bidirectional gated recurrent unit (Bi-GRU) attention-based multimodal, multi-timescale data fusion network (BiMNet) is proposed. It contains a data extraction module called the skip-concatenate gated recurrent unit (SC-GRU), a bimodal data fusion attention computation, and a decoder module. Together, these modules extract features at different temporal scales from multimodal action data and fuse them effectively. The model is validated on the ophthalmologist CCC multimodal maneuver dataset, which was collected with the data collection platform constructed in this research. It achieves an accuracy of 0.9124 ± 0.0125 in continuous action sequence segmentation and improves the F1-score of minority class recognition to over 80%, outperforming the baseline algorithms.
Funding:
National Natural Science Foundation of China [62027813, U20A20196]; National Key Research and Development Program of China [2022YFB4702900]; Beijing Science Fund for Distinguished Young Scholars, China [JQ21016]; Excellent member of CAS Youth Innovation Promotion Association, China [Y2022054]
First author affiliations: [1] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100096, Peoples R China; [2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; [*1] School of Automation, Beijing Information Science and Technology University, Beijing, 100096, China
Corresponding author:
Corresponding author affiliations: [1] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100096, Peoples R China; [2] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; [*1] School of Automation, Beijing Information Science and Technology University, Beijing, 100096, China
Recommended citation (GB/T 7714):
Bian Gui-Bin, Zheng Jia-Ying, Li Zhen, et al. BiMNet: A Multimodal Data Fusion Network for continuous circular capsulorhexis Action Segmentation[J]. EXPERT SYSTEMS WITH APPLICATIONS. 2024, 238: doi:10.1016/j.eswa.2023.121885.
APA:
Bian, Gui-Bin, Zheng, Jia-Ying, Li, Zhen, Wang, Jie, Fu, Pan...& De Albuquerque, Victor Hugo C. (2024). BiMNet: A Multimodal Data Fusion Network for continuous circular capsulorhexis Action Segmentation. EXPERT SYSTEMS WITH APPLICATIONS, 238,
MLA:
Bian, Gui-Bin, et al. "BiMNet: A Multimodal Data Fusion Network for continuous circular capsulorhexis Action Segmentation". EXPERT SYSTEMS WITH APPLICATIONS 238. (2024)