Evaluating Common Strategies for the Efficiency of Feature Selection in the Context of Microarray Analysis

Abstract: The recent explosion of high-throughput technology has been accompanied by a correspondingly rapid increase in the number of new statistical methods for developing prognostic and predictive signatures. Three commonly used feature selection techniques for time-to-event data, single gene testing (SGT), Elastic Net, and the Maximizing R Square Algorithm (MARSA), are evaluated on simulated datasets that vary in sample size, number of features, and correlation between features. The results of each method are summarized by reporting the sensitivity and the Area Under the Receiver Operating Characteristic Curve (AUC). The performance of each algorithm depends heavily on the sample size, while the number of features entered in the analysis has a much more modest impact. The coefficients estimated using SGT are biased towards the null when the genes are uncorrelated and away from the null when the genes are correlated. The Elastic Net algorithms perform better than MARSA and almost as well as SGT when the features are correlated, and about the same as MARSA when the features are uncorrelated.
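The abstract does not spell out how sensitivity and AUC are computed for a feature selection method, so the following is only a minimal sketch of one common reading, not the authors' procedure. It assumes that each method yields a per-feature score (for example an absolute coefficient), that the truly associated features are known from the simulation, and that selection is a simple cutoff on the score. The function names, the toy scores, and the cutoff of 1.0 are all hypothetical.

```python
def sensitivity(selected, true_features):
    """Fraction of truly associated features that the method selected."""
    true_set = set(true_features)
    if not true_set:
        raise ValueError("need at least one truly associated feature")
    return len(true_set & set(selected)) / len(true_set)


def selection_auc(scores, true_features):
    """AUC of per-feature scores: the probability that a truly associated
    feature outranks a null feature, counting ties as one half
    (the Mann-Whitney form of the AUC)."""
    true_set = set(true_features)
    pos = [s for f, s in scores.items() if f in true_set]
    neg = [s for f, s in scores.items() if f not in true_set]
    if not pos or not neg:
        raise ValueError("need both associated and null features")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy run: 6 features, of which g1 and g2 truly affect survival.
scores = {"g1": 2.9, "g2": 0.4, "g3": 1.1, "g4": 0.2, "g5": 0.1, "g6": 0.3}
truth = ["g1", "g2"]
selected = [f for f, s in scores.items() if s >= 1.0]  # illustrative cutoff

print(sensitivity(selected, truth))    # 0.5
print(selection_auc(scores, truth))    # 0.875
```

In this toy run the cutoff recovers only one of the two true features, while the AUC, which does not depend on the cutoff, shows that the scores still rank true features above most null ones.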

Authors: Melania Pintilie, Jenna Sykes

Affiliations: Department of Biostatistics; Department of Respirology

Source: Journal of Data Analysis and Information Processing, 2017, Issue 1, pp. 11-32 (22 pages)

Keywords: Elastic Net; MARSA; LASSO; Feature Selection

Classification code: R73 [Medicine and Health: Oncology]
