
Application of the XGBoost Algorithm to the Analysis of Binary Class-Imbalanced High-Dimensional Data (cited by: 5)

Application of XGBoost to the Analysis of Class-imbalanced High-dimensional Omics Data
Abstract

Objective: To evaluate the classification performance of the XGBoost algorithm on binary, class-imbalanced, high-dimensional data.

Methods: XGBoost was compared with random forest (RF), support vector machine (SVM), random under-sampling, and stochastic gradient boosting trees (SGBT), five methods in total, using simulation experiments and an analysis of real metabolomics data.

Results: In the simulation experiments, when the class imbalance was pronounced, XGBoost was superior or non-inferior to the other four algorithms under all experimental conditions. It also classified well when the classes were close to balanced, and it showed some robustness to noise variables. In the real-data analysis, XGBoost had the best classification performance of the five methods and ran faster without sacrificing classification performance.

Conclusion: XGBoost is suitable for discriminant analysis of class-imbalanced, high-dimensional omics data and merits further study.
Authors: Lu Yaxin; Huang Yue; Li Kang (Department of Medical Statistics, Harbin Medical University, Harbin 150081)
Source: Chinese Journal of Health Statistics (《中国卫生统计》), indexed in CSCD and the Peking University Core Journals list, 2021, Issue 1, pp. 21-24 (4 pages)
Funding: National Natural Science Foundation of China (81973149, 81773551).
Keywords: extreme gradient boosting (XGBoost); high-dimensional omics data; classification
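
The abstract compares methods on simulated class-imbalanced, high-dimensional data, with random under-sampling as one of the baselines. The following is a minimal sketch of that kind of setup, not the authors' code: the sample size, feature counts, minority fraction, mean shift, and the use of AUC as the evaluation metric are all assumptions chosen for illustration, and the function names are hypothetical. It uses only the Python standard library.

```python
import random
from collections import Counter


def simulate_imbalanced(n_samples=200, n_features=50, n_informative=5,
                        minority_frac=0.1, shift=1.0, seed=0):
    """Simulate binary, class-imbalanced, high-dimensional data.

    Assumption for illustration: only the first n_informative features
    differ between classes (mean shift); the rest are pure noise.
    """
    rng = random.Random(seed)
    n_pos = max(1, round(n_samples * minority_frac))
    labels = [1] * n_pos + [0] * (n_samples - n_pos)
    rng.shuffle(labels)
    rows = []
    for y in labels:
        row = [rng.gauss(shift * y if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        rows.append(row)
    return rows, labels


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    keep = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]
    keep += rng.sample(majority_idx, counts[minority])
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


X, y = simulate_imbalanced()
X_bal, y_bal = random_undersample(X, y)
print("original class counts:", dict(Counter(y)))
print("after under-sampling: ", dict(Counter(y_bal)))

# Toy score (assumption): the mean of the informative features stands in
# for a fitted classifier's predicted probability.
toy_scores = [sum(row[:5]) / 5 for row in X]
print("AUC of toy score:", round(auc(toy_scores, y), 3))
```

In a fuller comparison, the toy score would be replaced by the predicted probabilities of each fitted classifier, evaluated on held-out data rather than on the training set.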
