A machine learning-based framework to identify type 2 diabetes through electronic health records

Document details


Indexed in: ◇ SCIE ◇ SSCI ◇ EI

Affiliations: [1]Institute of Image Communication and Networking, Shanghai Jiao Tong University, Shanghai, China [2]Tongren Hospital Shanghai Jiao Tong University, Shanghai, China [3]Department of Electrical Engineering & Computer Science, Vanderbilt University, Nashville, TN, USA [4]Department of Endocrinology, the First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China [5]Division of Epidemiology, Vanderbilt University, Nashville, TN, USA [6]Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA

Keywords: Electronic health records; Type 2 diabetes; Data mining; Feature engineering; Machine learning

Abstract:
Objective: Discovering diverse genotype-phenotype associations for Type 2 Diabetes Mellitus (T2DM) via genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) requires identifying more cases (T2DM subjects) and controls (subjects without T2DM), e.g., via Electronic Health Records (EHR). However, existing expert-based identification algorithms often suffer from a low recall rate and can miss a large number of valuable samples under conservative filtering standards. The goal of this work is to develop, as a pilot study, a semi-automated machine learning-based framework that liberalizes the filtering criteria to improve the recall rate while keeping the false positive rate low.
Materials and Methods: We propose a data-informed framework for identifying subjects with and without T2DM from EHR via feature engineering and machine learning. We evaluate and contrast the identification performance of widely used machine learning models within our framework, including k-Nearest-Neighbors, Naive Bayes, Decision Tree, Random Forest, Support Vector Machine and Logistic Regression. The framework was applied to 300 patient samples (161 cases, 60 controls and 79 unconfirmed subjects), randomly selected from a diabetes-related cohort of 23,281 retrieved from a regional distributed EHR repository covering 2012 to 2014.
Results: We apply top-performing machine learning algorithms to the engineered features. We benchmark and contrast the accuracy, precision, AUC, sensitivity and specificity of the classification models against the state-of-the-art expert algorithm for identifying T2DM subjects. Our results indicate that the framework achieved high identification performance (approximately 0.98 average AUC), much higher than that of the state-of-the-art algorithm (0.71 AUC).
Discussion: Expert algorithm-based identification of T2DM subjects from EHR is often hampered by high missing rates caused by conservative selection criteria. Our framework leverages machine learning and feature engineering to loosen these criteria and achieve a high identification rate for cases and controls.
Conclusions: Our proposed framework demonstrates a more accurate and efficient approach for identifying subjects with and without T2DM from EHR. (C) 2016 Elsevier Ireland Ltd. All rights reserved.
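The abstract compares models on accuracy, precision, AUC, sensitivity (recall) and specificity. As a reading aid, here is a minimal sketch of how these five metrics are conventionally computed for a binary case/control task. It is not the authors' code: the function names (`confusion_counts`, `summary_metrics`, `auc`), the label coding (1 = T2DM case, 0 = control) and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count outcomes for binary labels (assumed coding: 1 = case, 0 = control)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0: a missed case
            counts["fn"] += 1
    return counts


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall) and specificity."""
    c = confusion_counts(y_true, y_pred)
    tp, fp, tn, fn = c["tp"], c["fp"], c["tn"], c["fn"]
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": _ratio(tp, tp + fp),
        "sensitivity": _ratio(tp, tp + fn),  # recall: share of true cases found
        "specificity": _ratio(tn, tn + fp),  # 1 - false positive rate
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only (not from the study).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(summary_metrics(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
```

In these terms, the trade-off described in the Objective is raising sensitivity (fewer missed cases) without letting specificity fall, since the false positive rate equals 1 minus specificity.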

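Chinese Academy of Sciences (CAS) journal ranking:
Edition for year of publication [2016]:
Major category | Tier 3 Medicine
Subcategory | Tier 2 Computer Science: Information Systems; Tier 3 Health Care Sciences & Services; Tier 3 Medical Informatics
Latest edition [2023]:
Major category | Tier 2 Medicine
Subcategory | Tier 2 Computer Science: Information Systems; Tier 2 Health Care Sciences & Services; Tier 3 Medical Informatics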
JCR quartile:
Edition for year of publication [2015]:
Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS; Q2 HEALTH CARE SCIENCES & SERVICES; Q2 MEDICAL INFORMATICS
Latest edition [2023]:
Q1 HEALTH CARE SCIENCES & SERVICES; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Q2 MEDICAL INFORMATICS

First author affiliation: [1]Institute of Image Communication and Networking, Shanghai Jiao Tong University, Shanghai, China [2]Tongren Hospital Shanghai Jiao Tong University, Shanghai, China
Corresponding author affiliation: [*1]2525 West End Ave, Suite 1475, Department of Biomedical Informatics, Vanderbilt University, Nashville, TN 37203 USA.