Abstract
Classifying tumors from their gene expression profiles is an important research topic in bioinformatics. Traditional methods for extracting tumor features mostly rely on informative gene selection, but screening genes inevitably discards some classification information. This paper proposes a feature extraction method for tumor subtypes based on adjacency matrix decomposition. A Gaussian-weighted adjacency matrix is first constructed from the tumor gene expression profile data, singular value decomposition is then applied to the adjacency matrix, and finally the feature row vectors of the resulting orthogonal matrix are fed as classification features into a support vector machine (SVM) for classification. Experiments using the leave-one-out method on a gene expression data set covering two leukemia subtypes demonstrate the feasibility and effectiveness of the method.
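The feature extraction steps named in the abstract (Gaussian-weighted adjacency matrix, then singular value decomposition, then row vectors of the orthogonal factor as features) can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: the kernel form exp(-||x_i - x_j||^2 / (2 sigma^2)), the median heuristic for sigma, the number of retained components k, the function names, and the synthetic data are all hypothetical choices that the abstract does not specify. The SVM and leave-one-out steps are left out.

```python
import numpy as np


def gaussian_adjacency(X, sigma=None):
    """Build a Gaussian-weighted adjacency matrix between samples.

    X has shape (n_samples, n_genes). Entry (i, j) of the result is
    exp(-||x_i - x_j||^2 / (2 * sigma^2)). The kernel form and the
    default sigma are assumptions made for this sketch.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative round-off
    if sigma is None:
        # Assumed heuristic: median pairwise distance between samples.
        off_diag = sq_dists[~np.eye(len(X), dtype=bool)]
        sigma = np.sqrt(np.median(off_diag))
        if sigma == 0.0:
            sigma = 1.0
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def svd_features(W, k):
    """Decompose W by SVD and return one k-dimensional row per sample."""
    U, _singular_values, _Vt = np.linalg.svd(W)
    return U[:, :k]  # row i is the feature vector of sample i


# Synthetic stand-in for an expression matrix: 10 samples x 50 genes,
# drawn from two shifted groups to mimic two subtypes.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, size=(5, 50)),
    rng.normal(loc=1.0, size=(5, 50)),
])
W = gaussian_adjacency(X)
features = svd_features(W, k=3)
print(features.shape)
```

Each row of `features` would then serve as the input vector for one sample in whatever classifier is used downstream, such as the SVM evaluated by leave-one-out in the paper.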
Source
《生物学杂志》
CAS
CSCD
2011, Issue 2, pp. 87-89 (3 pages)
Journal of Biology
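Funding
National Natural Science Foundation of China (10601001, 60772121)
Natural Science Foundation of Anhui Province (070412065)
Natural Science Research Project of the Anhui Provincial Department of Education (2006KJ030B)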
Keywords
bioinformatics
adjacency matrix
gene expression data
feature extraction