
Detection of medical text semantic similarity based on convolutional neural network

Document details

Resource type:
WOS classification:
PubMed classification:

Indexed in: SCIE

Affiliations: [1]Shanghai Jiao Tong Univ, Inst Image Commun & Networking, Shanghai, Peoples R China [2]Shanghai Jiao Tong Univ, Tongren Hosp, Shanghai, Peoples R China [3]Synyi Res, Shanghai, Peoples R China [4]Shanghai Jiao Tong Univ, APEX Data & Knowledge Management Lab, Shanghai, Peoples R China [5]Weill Cornell Med, Dept Healthcare Policy & Res, New York, NY USA [6]Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source:
ISSN:

Keywords: Text similarity; Convolutional neural network; LIME; Natural language processing

Abstract:
Background: Imaging examinations, such as ultrasonography, magnetic resonance imaging and computed tomography scans, play key roles in healthcare settings. To assess and improve the quality of imaging diagnosis, pre-existing imaging and pathology reports that cover overlapping body sites must be found manually in electronic medical records (EMRs) and compared. Retrieving those reports is time-consuming. In this paper, we propose a convolutional neural network (CNN) based method that makes better use of the semantic information in report texts to accelerate retrieval.
Methods: We included 16,354 imaging and pathology report-pairs from 1,926 patients who were admitted to Shanghai Tongren Hospital and had ultrasonic examinations between 1 May 2017 and 31 July 2017. We adapted the CNN model to calculate similarities among the report-pairs and identify target report-pairs with overlapping body sites, and compared its performance with six conventional models: keyword mapping, latent semantic analysis (LSA), latent Dirichlet allocation (LDA), Doc2Vec, Siamese long short-term memory (LSTM) and a model based on named entity recognition (NER). We also used a graph embedding method to enhance the word representation by capturing semantic relations from medical ontologies. Additionally, we used the LIME algorithm to identify which features (or words) are decisive for the prediction results, improving model interpretability.
Results: Experimental results showed that our CNN model improved significantly over all of the conventional models on area under the receiver operating characteristic curve (AUROC), precision, recall and F1-score on our test dataset. The AUROC of our CNN models improved by approximately 3-7%. The AUROC of the CNN model with graph-embedding and ontology-based medical concept vectors was 0.8% higher than that of the model with randomly initialized vectors and 1.5% higher than that of the model with pre-trained word vectors.
Conclusion: Our study demonstrates that a CNN model with pre-trained medical concept vectors could accurately identify target report-pairs with overlapping body sites and potentially accelerate report retrieval for imaging diagnosis quality measurement.
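The evaluation named in the abstract (scoring report-pairs by similarity, then measuring AUROC, precision, recall and F1-score) can be illustrated with a minimal sketch. This is not the authors' CNN: it swaps in a plain bag-of-words cosine similarity, roughly in the spirit of a keyword baseline, and the report texts, labels, function names and the 0.3 threshold are all made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def auroc(labels, scores) -> float:
    """AUROC as the probability that a positive pair outscores a negative
    pair, counting ties as 0.5 (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_f1(labels, scores, threshold: float):
    """Precision, recall and F1 when pairs scoring >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy data: (imaging report, pathology report, 1 if body sites overlap).
PAIRS = [
    ("ultrasound of the thyroid shows a nodule", "thyroid nodule biopsy shows benign tissue", 1),
    ("ultrasound of the liver shows a cyst", "liver biopsy shows a simple cyst", 1),
    ("ultrasound of the thyroid shows a nodule", "gastric mucosa biopsy with mild inflammation", 0),
    ("ultrasound of the liver shows a cyst", "breast mass excision with fibroadenoma", 0),
]

if __name__ == "__main__":
    labels = [y for _, _, y in PAIRS]
    scores = [cosine_similarity(a, b) for a, b, _ in PAIRS]
    print("AUROC:", auroc(labels, scores))
    print("precision, recall, F1:", precision_recall_f1(labels, scores, threshold=0.3))
```

A word-overlap score like this misses synonyms and related anatomical terms that share no surface words, which is the kind of gap the abstract's semantic models and ontology-based concept vectors are meant to close.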

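Language:
Times cited:
WOS:
PubMed ID:
Chinese Academy of Sciences (CAS) journal ranking:
Publication-year [2018] edition:
Major category | Tier 4, Medicine
Subcategory | Tier 4, Medicine: Informatics
Latest [2025] edition:
Major category | Tier 3, Medicine
Subcategory | Tier 3, Medicine: Informatics
JCR quartile:
Publication-year [2017] edition:
Q2 MEDICAL INFORMATICS
Latest [2023] edition:
Q2 MEDICAL INFORMATICS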

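Impact factor: latest [2023 edition]; latest five-year average; publication year [2017 edition]; publication-year five-year average; year before publication [2016 edition]; year after publication [2018 edition]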

First author:
First author affiliations: [1]Shanghai Jiao Tong Univ, Inst Image Commun & Networking, Shanghai, Peoples R China [2]Shanghai Jiao Tong Univ, Tongren Hosp, Shanghai, Peoples R China
Corresponding author:
Corresponding author affiliations: [3]Synyi Res, Shanghai, Peoples R China [6]Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Recommended citation (GB/T 7714):
APA:
MLA:
