Construction and Interpretability Analysis of a Multi-Class Machine Learning Model for Predicting HER2 Expression Status Based on Dual-Modal Breast Imaging Features


Abstract

Objective: To construct multi-class machine learning models based on dual-modal imaging features from mammography and ultrasound for preoperative prediction of human epidermal growth factor receptor 2 (HER2) expression status in breast cancer patients.

Methods: A total of 632 female breast cancer patients meeting the inclusion and exclusion criteria were enrolled, including 141 with no HER2 expression, 311 with low expression, and 180 with high expression. Imaging signs of the breast cancer lesions were extracted from full-field digital mammography (FFDM), digital breast tomosynthesis (DBT), and breast ultrasound (US) images. After data preprocessing, three-class prediction models for HER2 expression status were built with five machine learning algorithms. The models were trained using five-fold cross-validation, and the overall accuracy, precision, recall, F1-score, and area under the receiver operating characteristic (ROC) curve (AUC) were calculated for each three-class model. The bootstrap method was used to compare AUCs between models and to select the three-class model with the best classification performance. The SHAP (SHapley Additive exPlanations) method was used to assess the importance of each feature for the model's prediction of HER2 expression status.

Results: On the test set, the random forest (RF) model showed the best three-class classification performance for HER2 expression status, with a macro-average AUC of 0.723, a micro-average AUC of 0.783, an overall accuracy of 57.4%, and a macro-average recall of 53.0%. SHAP analysis showed that the five most important global features influencing the output of the RF model were, in descending order, calcifications observed on X-ray, diastolic blood pressure, fine linear or fine-linear branching calcifications, maximum lesion diameter measured on US, and CA153.

Conclusion: A multi-class machine learning model based on dual-modal mammographic and ultrasound imaging features can predict the HER2 expression status of breast cancer patients preoperatively. Combining the model with the SHAP method can improve its interpretability.
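The abstract reports both a macro-average and a micro-average AUC for the three-class model. As a minimal sketch of how these two averages differ, the following pure-Python example computes one-vs-rest AUCs on a tiny made-up data set. The function names, the class coding (0 = no expression, 1 = low, 2 = high), and the probability values are all hypothetical illustrations, not the study's data, code, or software.

```python
from typing import List, Sequence, Tuple


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_micro_auc(
    y_true: Sequence[int], proba: Sequence[Sequence[float]], n_classes: int = 3
) -> Tuple[float, float, List[float]]:
    """One-vs-rest AUCs for a multi-class problem.

    Macro: unweighted mean of the per-class AUCs.
    Micro: a single AUC over all (sample, class) pairs pooled together.
    """
    per_class: List[float] = []
    pooled_labels: List[int] = []
    pooled_scores: List[float] = []
    for k in range(n_classes):
        onehot = [1 if y == k else 0 for y in y_true]
        scores = [row[k] for row in proba]
        per_class.append(binary_auc(onehot, scores))
        pooled_labels.extend(onehot)
        pooled_scores.extend(scores)
    macro = sum(per_class) / n_classes
    micro = binary_auc(pooled_labels, pooled_scores)
    return macro, micro, per_class


# Hypothetical toy data: 0 = no expression, 1 = low, 2 = high (assumed coding).
y_true = [0, 0, 1, 1, 2, 2]
proba = [
    [0.6, 0.3, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.1, 0.3, 0.6],
    [0.2, 0.3, 0.5],
]
macro, micro, per_class = macro_micro_auc(y_true, proba)
print(f"per-class AUC: {per_class}")
print(f"macro AUC: {macro:.3f}, micro AUC: {micro:.3f}")
```

Because the macro average weights each class equally while the micro average pools every sample-class pair, the two values generally differ when class sizes are unequal, as they are in a cohort of 141, 311, and 180 cases.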

Authors: SHU Yujing; HE Zilong; ZENG Weixiong; LIU Jialing; GUO Yaya; CHEN Weiguo. Department of Radiology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong Province 510515, P.R. China

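Affiliations: Department of Radiology, Nanfang Hospital, Southern Medical University; Radiology Imaging Center, The First Dongguan Affiliated Hospital of Guangdong Medical University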

Source: Journal of Clinical Radiology (《临床放射学杂志》), Peking University Core Journal, 2024, Issue 8, pp. 1310-1316 (7 pages)

Funding: National Natural Science Foundation of China (Grant No. 82171929).

Keywords: Breast Cancer; HER2; Expression Status; Mammography; Ultrasound; Machine Learning

Classification number: R737.9 [Medicine and Health: Oncology]

