Method of extracting features from DNA microarray data for classification

Original title: DNA微阵列数据特征提取的分类方法研究 (cited by: 1)

Abstract: Gene sets selected from DNA microarray data by common ranking methods often contain highly correlated genes, and scoring genes one at a time does not truly reflect the classification ability of the resulting feature set. In addition, the number of genes far exceeds the number of samples, which poses a further challenge for disease diagnosis. To address these problems, a feature extraction method for DNA microarray data is proposed for tissue classification. The method clusters the genes with K-means to obtain the center of each gene cluster, uses a ranking method to remove the clusters that are irrelevant to classification, applies ICA to extract features from the remaining clusters, and finally builds a classifier with SVMs to classify tissues. Experiments on real biological data show that, by extracting a kind of compound gene, the method evaluates the classification ability of genes jointly, reduces the number of features, and improves the classification accuracy of the classifier.
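
The four steps named in the abstract (K-means on genes, ranking of cluster centers, ICA, SVM) can be sketched roughly as follows. This is only an illustration built on scikit-learn and synthetic data, not the authors' implementation: the abstract does not state the ranking criterion, the number of clusters, the number of independent components, or the SVM kernel, so the absolute t-statistic, `n_clusters=10`, `n_keep=5`, `n_components=3`, the linear kernel, and every function name below are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_toy_data(n_samples=60, n_genes=300, n_informative=30, seed=0):
    """Synthetic stand-in for microarray data: rows are tissues, columns are genes."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)
    X = rng.normal(size=(n_samples, n_genes))
    X[y == 1, :n_informative] += 2.0  # genes whose expression differs by class
    return X, y


def cluster_means(X, labels, cluster_ids):
    """One 'compound gene' per cluster: the mean expression of its member genes."""
    return np.column_stack([X[:, labels == k].mean(axis=1) for k in cluster_ids])


def t_scores(F, y):
    """Absolute two-sample t-statistic per column (assumed ranking criterion)."""
    a, b = F[y == 0], F[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se


def fit(X, y, n_clusters=10, n_keep=5, n_components=3, seed=0):
    # 1. Cluster the genes: each gene is a point whose coordinates are its
    #    expression values across the training samples, hence X.T.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(X.T)
    # 2. Rank the cluster centers and keep only the most discriminative ones.
    centers = cluster_means(X, labels, range(n_clusters))
    keep = np.argsort(t_scores(centers, y))[::-1][:n_keep]
    # 3. Extract independent components from the retained centers.
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    Z = ica.fit_transform(centers[:, keep])
    # 4. Train an SVM on the extracted features.
    clf = SVC(kernel="linear").fit(Z, y)
    return {"labels": labels, "keep": keep, "ica": ica, "clf": clf}, Z


def predict(model, X):
    """Apply the gene grouping, selection, and ICA learned on the training set."""
    centers = cluster_means(X, model["labels"], model["keep"])
    return model["clf"].predict(model["ica"].transform(centers))


X, y = make_toy_data()
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model, Z_tr = fit(X_tr, y_tr)
accuracy = float(np.mean(predict(model, X_te) == y_te))
print(f"compound features: {Z_tr.shape[1]}, held-out accuracy: {accuracy:.2f}")
```

Clustering, ranking, and ICA are all fitted on the training tissues only and then reused on the held-out tissues, so the label-dependent ranking step does not leak test information into the accuracy estimate. The sketch reduces 300 simulated genes to 3 features, which mirrors the abstract's goal of classifying with far fewer features than genes.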

Authors: Peng Hongyi (彭红毅), Ye Yanrui (叶燕锐), Zhang Junhui (张俊辉), Luo Zeju (罗泽举), Feng Guohe (奉国和)

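Affiliations: Department of Statistics, College of Science, South China Agricultural University; School of Bioscience and Bioengineering, South China University of Technology; School of Computer Science and Information Engineering, Chongqing Technology and Business University; School of Economics and Management, South China Normal University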

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 28, pp. 40-42 (3 pages)

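Funding: National Social Science Foundation of China, No.08CTQ003; Natural Science Foundation of Guangdong Province, No.2008276; President's Fund of South China Agricultural University, No.4900-K06166; Key Science and Technology Project of the Chongqing Science and Technology Commission, No.2008AC0043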

Keywords: DNA microarray; feature extraction; Independent Component Analysis (ICA); clustering analysis; Support Vector Machines (SVMs)

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]


