Application of Improved Random Forest Algorithm in Breast Tumour Diagnosis (cited by: 5)
Abstract: To address the cost-sensitive, imbalanced classification problem in breast tumour diagnosis, this paper proposes a diagnosis model based on an improved random forest algorithm. First, building on the random forest algorithm, the diagnostic performance on benign and malignant breast tumour samples is considered separately, and the factors that determine the upper bound of the random forest generalisation error are used to derive upper bounds on the recall rate (true positive rate, TPR) and the false alarm rate (false positive rate, FPR) of the ROC curve. This gives a benchmark for optimising classification performance for a specific class. ROC curves are then plotted from the TPR and FPR values obtained at different decision thresholds; the average correlation is adjusted and the model is retrained, and the diagnosis model with the optimal average correlation is selected according to ROC curve performance. Finally, the improved random forest algorithm is compared with traditional methods in terms of diagnostic performance. Experimental results show that the proposed model is feasible and effective in improving the recognition of malignant tumours while maintaining overall diagnostic performance.
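As a rough illustration of the ROC step mentioned in the abstract, the sketch below computes TPR and FPR at several decision thresholds and a trapezoidal area under the resulting curve. It is not the authors' implementation: the function names `roc_points` and `auc_trapezoid`, the toy labels, and the scores (read here as the fraction of trees voting "malignant") are all assumptions made for the example, and the adjustment of the average correlation between trees is not modelled.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float],
               thresholds: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, TPR, FPR) for each decision threshold.

    labels: 1 = malignant (positive class), 0 = benign.
    scores: hypothetical fraction of trees voting "malignant" per sample.
    A sample is predicted malignant when its score >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one sample of each class")
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((t, tp / positives, fp / negatives))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float, float]]) -> float:
    """Trapezoidal area under the ROC curve through the given points.

    The end points (0, 0) and (1, 1) are added so the curve spans FPR 0..1.
    """
    curve = sorted({(fpr, tpr) for _, tpr, fpr in points}
                   | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy, made-up data: 3 malignant and 5 benign samples (imbalanced).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
points = roc_points(labels, scores, thresholds=[0.25, 0.5, 0.75])
area = auc_trapezoid(points)

for threshold, tpr, fpr in points:
    print(f"threshold={threshold:.2f}  TPR={tpr:.3f}  FPR={fpr:.3f}")
print(f"approximate AUC = {area:.3f}")
```

Lowering the threshold raises the TPR for the malignant class at the cost of a higher FPR, which is the trade-off a cost-sensitive diagnosis model has to balance when choosing an operating point.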
Authors: 王平 (Wang Ping), 单文英 (Shan Wenying)
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2016, No. 4, pp. 252-257, 264 (7 pages in total)
Funding: 2014 Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ14137)
Keywords: breast tumour; diagnosis; cost-sensitive; imbalanced classification; random forest; ROC curve