Effectiveness of automated machine learning models in predicting the risk of preeclampsia in the first trimester
Abstract

Objective: To evaluate how well automated machine learning (autoML) models predict the risk of preeclampsia in the first trimester.

Methods: A total of 2180 women with singleton pregnancies who registered at Jinan Second Maternal and Child Health Hospital and underwent a prenatal examination at 12 weeks of gestation between January 2017 and October 2020 were enrolled. They were divided into a preeclampsia group (103 cases) and a control group (2077 cases) according to whether preeclampsia occurred at any point during the pregnancy. Clinical data and hematological indexes were compared between the two groups, and the correlation of each index with the risk of preeclampsia was analyzed. The women were randomly split into a training set and a test set at a ratio of 7:3. The autogluon autoML algorithm was used to build multiple machine learning models, which were trained and cross-validated on the training set, and the training and validation accuracies of the models were compared. The importance of each index in the autoML model was analyzed. The autoML model and a logistic regression model were each used to predict the risk of preeclampsia for the women in the test set, and receiver operating characteristic (ROC) curves were used to evaluate the predictive performance of the two models.

Results: In the preeclampsia group, age, pre-pregnancy body mass index, body mass index at 12 weeks of gestation, waist circumference at 12 weeks of gestation, proportion of women with a drinking history, high-sensitivity C-reactive protein (hs-CRP), triglyceride, low-density lipoprotein cholesterol (LDL-C), aspartate aminotransferase (AST), platelet distribution width (PDW), mean platelet volume, thyroid stimulating hormone (TSH) and β-human chorionic gonadotropin (β-HCG) were all significantly higher than in the control group (all P<0.05), whereas free tri-iodothyronine (free T3), free thyroxine (free T4), placental growth factor (PlGF), soluble fms-like tyrosine kinase-1 (sFlt-1) and pregnancy-associated plasma protein-A (PAPP-A) were all significantly lower (all P<0.05). Correlation analysis showed relatively high correlations between the risk of preeclampsia and pre-pregnancy body mass index, body mass index at 12 weeks of gestation, waist circumference at 12 weeks of gestation, hs-CRP, triglyceride, AST, TSH, free T3, free T4, β-HCG, PlGF, sFlt-1 and PAPP-A, whereas the correlations among the indexes themselves were low. The autoML algorithm built 18 models in 8 categories, and the FastAI-based neural network_L2 had the highest accuracy in both the training set (0.963) and the validation set (0.971). TSH, LDL-C, PDW, waist circumference at 12 weeks of gestation, sFlt-1 and AST ranked high in importance, whereas free T4, total cholesterol, gravidity, drinking history, parity and family history of hypertension ranked low. The area under the ROC curve of the autoML model for predicting the risk of preeclampsia in the first trimester was significantly higher than that of the logistic regression model (0.984 vs 0.765, P=0.002). The two models did not differ significantly in prediction accuracy on the training set (P>0.05). On the test set, the accuracy and sensitivity of the autoML model were both significantly higher than those of the logistic regression model (99.54% vs 98.32% and 93.75% vs 75.00%, respectively; both P<0.05).

Conclusions: First-trimester TSH, LDL-C, PDW, waist circumference at 12 weeks of gestation, sFlt-1 and AST are associated to some degree with the risk of preeclampsia. An autoML model based on first-trimester indicators has high predictive value for the risk of preeclampsia.
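The test-set percentages reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the source does not report a confusion matrix, so the counts are hypothetical values chosen to reproduce the reported accuracy and sensitivity, assuming the 7:3 split placed 654 of the 2180 women, 32 of them with preeclampsia, in the test set. The names holdout_size, Confusion, automl and logistic are likewise made up for this example.

```python
from dataclasses import dataclass

TOTAL_WOMEN = 2180  # cohort size reported in the abstract


def holdout_size(total: int) -> int:
    """Size of the 30% hold-out under a 7:3 split, using integer arithmetic."""
    return total - (total * 7) // 10


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts; preeclampsia is the positive class."""
    tp: int  # cases correctly flagged
    fn: int  # cases missed
    fp: int  # controls wrongly flagged
    tn: int  # controls correctly cleared

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn

    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)


# HYPOTHETICAL counts, not taken from the paper: they are one set of
# integers consistent with the reported test-set percentages.
automl = Confusion(tp=30, fn=2, fp=1, tn=621)
logistic = Confusion(tp=24, fn=8, fp=3, tn=619)

if __name__ == "__main__":
    print("test set size:", holdout_size(TOTAL_WOMEN))
    for name, cm in (("autoML", automl), ("logistic", logistic)):
        print(f"{name}: accuracy={cm.accuracy():.2%}, "
              f"sensitivity={cm.sensitivity():.2%}")
```

Under these assumed counts, the gap in sensitivity (30 of 32 cases detected versus 24 of 32) is much larger than the gap in accuracy. This is expected when only about 5% of the cohort develops preeclampsia, because accuracy is dominated by the controls.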
Authors Chen Hongbo; Li Hong; Zhao Chunmei; Xie Shaoyun; Jia Chunmei (Maternity School, Jinan Second Maternal and Child Health Hospital, Jinan 271100, China; Maternity Nutrition Outpatient Clinic, Jinan Second Maternal and Child Health Hospital, Jinan 271100, China; Hospital Office, Jinan Second Maternal and Child Health Hospital, Jinan 271100, China; Health Education Department, Jinan Second Maternal and Child Health Hospital, Jinan 271100, China)
Source Chinese Journal of Health Management (中华健康管理学杂志), CAS, CSCD, 2022, No. 8, pp. 553-560 (8 pages)
Keywords Preeclampsia; Pregnancy trimester, first; Machine learning; Prediction model; Screening
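The area under the ROC curve that the abstract uses to compare the two models has a simple rank-based reading. As a minimal sketch, the AUC equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half (the Mann-Whitney formulation). The function name roc_auc and the toy scores are assumptions made for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float],
            control_scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair counts 1 when the case scores higher than the control,
    0.5 when the two scores are tied, and 0 otherwise.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy predicted risks, invented for this example.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    # 11 of the 12 case-control pairs are ordered correctly.
    print(f"AUC = {roc_auc(cases, controls):.3f}")
```

The pairwise loop is quadratic in the sample size, which is acceptable for a cohort of a few thousand women. A sort-based rank computation would give the same value faster on larger data.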