Abstract
Objective To build a machine learning diagnostic model that combines feature dimensionality reduction with an artificial neural network classifier, and to explore the auxiliary diagnostic value of routine clinical blood indexes for ovarian cancer. Methods Patients with a confirmed diagnosis of ovarian cancer at our hospital formed the case group (n=185). Patients with other malignant gynecological tumors (n=138), patients with benign gynecological diseases (n=339), and healthy individuals undergoing physical examination (n=92) were pooled into a single control group. Demographic data and 28 laboratory indexes in 6 categories, including tumor markers, blood cell analysis, and sex hormones, were obtained through an electronic medical record mining system. Principal component analysis was used to extract features from the laboratory data, and the resulting low-dimensional feature set served as the input-layer variables of a neural network diagnostic model. A genetic algorithm was used to optimize the network parameters in order to improve training speed and classification accuracy. Results The area under the ROC curve of the machine learning diagnostic model was 0.948, with a sensitivity of 91.9% and a specificity of 86.9%. Its diagnostic performance was markedly superior to that of the conventional single-marker CA125 test. The model accurately identified ovarian cancer at different clinical stages, and it showed some ability to discriminate ovarian cancer from each of the three control subgroups. Conclusion Integrating multiple routine laboratory indexes through machine learning can effectively improve the diagnostic performance for ovarian cancer and offers a new approach to intelligent auxiliary diagnosis of the disease.
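The three figures reported in the Results (area under the ROC curve, sensitivity, specificity) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the ten toy scores below are all assumptions made for illustration, and the data are made up rather than taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int], scores: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity), calling score >= threshold positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)  # cases flagged
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)   # cases missed
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)   # controls cleared
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)  # controls flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs: 1 = ovarian cancer case, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.3, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
# prints: sensitivity=0.750 specificity=0.833 AUC=0.958
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance across all thresholds, which is why the abstract reports it separately.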
Authors
张桐硕
任鹤菲
曹瑾
崔云涛
刘金龙
陈晓
吴龙涛
李艳秋
ZHANG Tongshuo; REN Hefei; CAO Jin; CUI Yuntao; LIU Jinlong; CHEN Xiao; WU Longtao; LI Yanqiu (Logistics University of PAP, Tianjin 300309; Department of Clinical Laboratory, Special Medical Center of PAP, Tianjin 300152; Department of Gynaecology and Obstetrics, Special Medical Center of PAP, Tianjin 300152; Shanghai Le9 Healthcare Technology Co., Ltd, Shanghai 200233, China)
Source
《临床检验杂志》
CAS
CSCD
2018, Issue 12, pp. 908-913 (6 pages)
Chinese Journal of Clinical Laboratory Science
Keywords
ovarian cancer
machine learning
laboratory index
diagnosis