Abstract
Effectively distinguishing normal samples from tumor samples in gene expression profiles depends on two things: accurately identifying the smallest set of feature genes that determines the sample class, and classifying with a well-performing classifier. To address this problem, the paper uses a Revised Feature Score Criterion (RFSC) to remove genes irrelevant to the classification task; improves the pairwise redundancy method by proposing a strong correlative tree method for removing redundant genes; and improves the Rough Support Vector Machine (RSVM) by proposing the Approximate Equivalence Rough Support Vector Machine (AE-RSVM), which is used to classify the sample sets. Tests on tumor sample sets indicate that the proposed method is feasible and effective.
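The two gene-filtering stages named in the abstract (drop class-irrelevant genes, then drop redundant ones) can be illustrated with a minimal sketch. The abstract does not give the RFSC formula or the strong correlative tree construction, so this sketch substitutes stand-ins: a plain signal-to-noise score for relevance and a greedy pairwise Pearson-correlation filter for redundancy. The function names, the thresholds `score_min` and `corr_max`, and the toy data are all assumptions made for illustration, not the paper's method.

```python
from statistics import mean, pstdev


def signal_to_noise(values, labels):
    """Stand-in relevance score: |mean_pos - mean_neg| / (std_pos + std_neg)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = abs(mean(pos) - mean(neg))
    spread = pstdev(pos) + pstdev(neg)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def select_genes(expr, labels, score_min=1.0, corr_max=0.9):
    """expr maps gene name -> expression values, one per sample.

    Hypothetical thresholds: score_min for relevance, corr_max for redundancy.
    """
    scores = {g: signal_to_noise(v, labels) for g, v in expr.items()}
    # Stage 1: discard genes whose score suggests no relation to the class.
    relevant = sorted(
        (g for g in expr if scores[g] >= score_min),
        key=lambda g: -scores[g],
    )
    # Stage 2: greedy pairwise filter; keep a gene only if it is not
    # strongly correlated with any higher-scoring gene already kept.
    kept = []
    for g in relevant:
        if all(abs(pearson(expr[g], expr[k])) < corr_max for k in kept):
            kept.append(g)
    return kept


# Toy data: 3 tumor samples (label 1) followed by 3 normal samples (label 0).
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "g1": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],   # informative
    "g2": [18.0, 20.0, 23.0, 2.0, 4.0, 6.0],  # informative, nearly a copy of g1
    "g3": [5.0, 1.0, 3.0, 4.0, 2.0, 3.0],     # same class means: irrelevant
}
selected = select_genes(expr, labels)
```

On this toy data, `g3` is removed in the relevance stage and `g2` in the redundancy stage, leaving `g1`. In the paper's pipeline the surviving genes would then be passed to the AE-RSVM classifier, which is not sketched here.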
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2013, Issue 17, pp. 245-249 (5 pages)
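Funding
Open Project Fund of the Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences (No. 20070101)
Scientific Research Fund for Higher Education Institutions, Liaoning Provincial Department of Education (No. 2008344)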
Keywords
gene expression profile
tumor classification
gene selection
support vector machine
equivalence class