Breast Cancer Prognosis Simulation Based on Lasso-RFE
Abstract: Breast cancer gene expression data are high-dimensional, small-sample, and class-imbalanced, so they cannot be used directly for classification and prognosis prediction. To address this, a prognostic model is built from Lasso (Least Absolute Shrinkage and Selection Operator)-based recursive feature elimination and a support vector machine. First, differential analysis is performed on the feature genes, and genes without significant differences are removed. Second, the feature genes are double-sampled to mitigate the poor algorithm sensitivity caused by minority-class samples. In addition, an improved algorithm, Lasso-based recursive feature elimination, reduces the error introduced by changes in the Lasso tuning parameter and achieves a stable, stepwise reduction of the feature genes. Finally, a support vector machine classifies breast cancer prognosis using the 37 feature genes that remain after feature extraction. Compared with other models, the proposed model effectively improves accuracy and specificity and achieves fairly accurate prognosis prediction.
Authors: LIU Jia-xin (刘嘉欣); WANG Hong-wei (王宏伟); WANG Jia (王佳). Affiliations: School of Electrical Engineering, Xinjiang University, Urumqi, Xinjiang 830000, China; School of Basic Medicine, Dalian Medical University, Dalian, Liaoning 110041, China; Amy Hanxin Vaccine (Dalian) Co., Ltd., Dalian, Liaoning 116100, China
Source: Computer Simulation (《计算机仿真》), Peking University Core Journal, 2022, No. 12, pp. 330-335 (6 pages)
Funding: National Natural Science Foundation of China (61863034).
Keywords: Lasso; breast cancer; gene expression data; prognosis prediction; feature elimination
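
The feature-selection step named in the abstract, recursive feature elimination driven by Lasso, can be illustrated with a small dependency-free sketch. It is not the authors' implementation: the function names, the λ grid, the toy data, and the choice to sum absolute coefficients over several λ values (one plausible way to make the ranking less sensitive to a single λ) are all assumptions made for illustration. The differential analysis, double sampling, and support vector machine stages are not reproduced.

```python
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the vector y are already centred (no intercept).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Correlation of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + w[j] * col_sq[j]
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def lasso_rfe(X, y, n_keep, lambdas=(0.01, 0.05, 0.1), drop_per_round=1):
    """Hypothetical Lasso-RFE: repeatedly drop the features with the smallest
    absolute Lasso coefficients, summed over a grid of lambda values."""
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    n, p = len(X), len(X[0])
    # Centre features and target so that no intercept term is needed
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]

    active = list(range(p))  # indices of features still in play
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in Xc]
        scores = [0.0] * len(active)
        for lam in lambdas:
            w = lasso_coordinate_descent(X_sub, yc, lam)
            for k, coef in enumerate(w):
                scores[k] += abs(coef)
        n_drop = min(drop_per_round, len(active) - n_keep)
        weakest = set(sorted(range(len(active)), key=lambda k: scores[k])[:n_drop])
        active = [j for k, j in enumerate(active) if k not in weakest]
    return active


def make_toy_data(n_samples=60, n_features=6, seed=0):
    """Synthetic data (not gene expression): only features 0 and 3 drive y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
selected = lasso_rfe(X, y, n_keep=2)
print(selected)
```

On a real expression matrix, calling `lasso_rfe` with `n_keep=37` would mirror the number of genes the abstract reports, although the paper's actual elimination schedule and λ values are not stated in this record.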