Sparse Representation Based Feature Selection for Mass Spectrometry Data

Original title: 基于稀疏表示算法的蛋白质质谱数据特征选择
Cited by: 2
Abstract: Feature selection methods for high-dimensional, small-sample data are widely used in the processing and analysis of protein mass spectrometry data. This paper addresses feature selection for protein mass spectra and, drawing on the theoretical framework of sparse representation, proposes a sparse representation based feature selection algorithm (SRFS). SRFS treats a feature as important or informative if the feature subsets that contain it perform well in a sparse representation classifier (SRC). The SRC result on a given feature subspace serves as the measure of the relative importance of the features in that subspace. Aggregating these results over a large number of randomly sampled subspaces yields a ranking of every feature in the feature space, from which a small number of spectral peaks closely related to cancer are extracted. The method was evaluated on the public ovarian cancer dataset OC-WCX2a and on the breast cancer dataset BC-WCX2a supplied by Zhejiang Cancer Hospital. The experimental results show that SRFS can select highly predictive, representative feature sets from the SELDI-TOF protein mass spectrometry data analysed in this paper.
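The procedure outlined in the abstract (score random feature subspaces with an SRC, then aggregate the scores per feature) can be illustrated with a minimal sketch. This is not the authors' implementation. Several choices are assumptions made only for illustration: the L1 problem is solved with a basic iterative soft-thresholding (ISTA) loop, each subspace is scored by hold-out accuracy, and the per-feature statistic is the mean score of the subsets containing that feature. All function and parameter names (`ista_l1`, `src_predict`, `srfs_rank`, `lam`, `n_subsets`, `subset_size`, `test_frac`) are hypothetical.

```python
import numpy as np


def ista_l1(D, y, lam=0.05, n_iter=200):
    """Approximately solve min_x 0.5*||D x - y||^2 + lam*||x||_1 using ISTA."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x


def src_predict(D, labels, y, lam=0.05):
    """SRC: code y over all training columns, then pick the class whose
    own columns and coefficients leave the smallest reconstruction residual."""
    x = ista_l1(D, y, lam)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))]


def src_accuracy(X_train, y_train, X_test, y_test, lam=0.05):
    """Hold-out accuracy of SRC; rows are samples, columns are features."""
    D = X_train.T  # dictionary: one column per training sample
    D = D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)
    preds = [
        src_predict(D, y_train, x / (np.linalg.norm(x) + 1e-12), lam)
        for x in X_test
    ]
    return float(np.mean(np.array(preds) == y_test))


def srfs_rank(X, y, n_subsets=200, subset_size=5, test_frac=0.3, seed=0):
    """Rank features by the mean SRC accuracy of the random subsets containing them."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    score_sum = np.zeros(p)
    counts = np.zeros(p)
    n_test = max(1, int(test_frac * n))
    for _ in range(n_subsets):
        feats = rng.choice(p, size=subset_size, replace=False)  # random subspace
        perm = rng.permutation(n)                               # random split
        test, train = perm[:n_test], perm[n_test:]
        acc = src_accuracy(
            X[np.ix_(train, feats)], y[train], X[np.ix_(test, feats)], y[test]
        )
        score_sum[feats] += acc
        counts[feats] += 1
    # Features never sampled keep a score of 0.
    mean_score = np.divide(score_sum, counts, out=np.zeros(p), where=counts > 0)
    return np.argsort(-mean_score), mean_score
```

Calling `srfs_rank(X, y)` with a samples-by-features intensity matrix `X` and a label vector `y` returns the feature indices sorted from highest to lowest mean score, together with the scores. Averaging over only the subsets that contain a feature keeps the statistic comparable between features that happened to be sampled a different number of times.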
Source: Acta Biophysica Sinica (《生物物理学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2012, Issue 8, pp. 683-691 (9 pages).
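Funding: National Natural Science Foundation of China (60801054, 60801055); National Science Fund for Distinguished Young Scholars (60788101); Zhejiang Provincial Public Welfare Technology Application Research Project (2010C33017); Zhejiang Provincial Medical and Health Science Research Fund (2010KYA041); Zhejiang Provincial Science and Technology Project (Y2080586).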
Keywords: protein mass spectrum; sparse representation; feature selection