Abstract
Serial analysis of gene expression (SAGE) is a powerful technique for comprehensive gene-expression profiling, and the resulting data reflect dynamic changes within the cell. Pattern discovery and visualization have become fundamental approaches to analyzing SAGE data. However, SAGE data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that account for their specific properties, so traditional clustering techniques may not be suitable for SAGE data analysis. Based on multi-class classification methods and the support vector machine, this paper presents a novel algorithm for SAGE data analysis. Tested on simulated data and human cancer SAGE data, the algorithm shows several advantages over traditional clustering algorithms. The results indicate that the one-against-one (OAO) multi-class support vector machine with a radial basis function kernel gives better results than PoissonC and PoissonS.
Source
《生物信息学》
2010, No. 4, pp. 356-358, 363 (4 pages in total)
Chinese Journal of Bioinformatics
Funding
National Natural Science Foundation of China (60671061)
Keywords
serial analysis of gene expression
multi-class classification
support vector machine
kernel function