Methods for Extracting Genetic Tags and Improving the Recognition Rate


Abstract: Aim: To identify colon cancer patients accurately using fewer genetic tags. Methods: Based on the correlations among genes, fuzzy cluster analysis is used to cluster a large number of genes, and the distance between each gene and the gene center vector is introduced to build an optimization model. Next, according to how genes are distributed across the samples, genes are divided into mutant genes and irrelevant genes, and these two features are combined to build an optimization model. To raise the recognition rate, the Monte Carlo method is used to account for noise in the genes. Finally, the optimization model is rebuilt to take the features of the known genetic tags into account. Results: Without considering noise, 8 genetic tags are obtained and the correct recognition rate is 72.6%; with noise taken into account, the rate rises to 85.00%; after the known genetic tags are added, it rises to 87.1%; and when all genes that match the characteristics of the known genetic tags are included, 25 genetic tags are obtained and the recognition rate reaches 96.7%. Conclusion: The more gene features are considered, the higher the correct recognition rate.
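The clustering step described in the abstract can be illustrated with a small sketch. The abstract does not say which fuzzy clustering algorithm, similarity measure, or distance the authors used, so everything here is an assumption made for illustration: similarity is the Pearson correlation clipped to [0, 1], clusters come from a max-min transitive closure followed by a λ-cut (a common textbook form of fuzzy cluster analysis), and the "center vector" of a cluster is taken to be the mean expression profile of its genes. The function names and the toy data are hypothetical. The noise handling and the optimization models are not covered.

```python
import numpy as np


def fuzzy_similarity(expr):
    """Fuzzy similarity matrix for genes (rows) measured over samples (columns).

    Assumption: similarity is the Pearson correlation with negative
    values clipped to 0, so every entry lies in [0, 1].
    """
    sim = np.clip(np.corrcoef(expr), 0.0, 1.0)
    np.fill_diagonal(sim, 1.0)
    return sim


def transitive_closure(sim):
    """Max-min transitive closure, computed by repeated squaring.

    Builds an n x n x n intermediate array, so it is only suitable
    for small illustrative inputs.
    """
    while True:
        # (sim o sim)[i, j] = max over k of min(sim[i, k], sim[k, j])
        squared = np.max(np.minimum(sim[:, :, None], sim[None, :, :]), axis=1)
        if np.allclose(squared, sim):
            return squared
        sim = squared


def lambda_cut_clusters(closure, lam):
    """Put genes i and j in the same cluster when closure[i, j] >= lam."""
    labels = -np.ones(closure.shape[0], dtype=int)
    next_label = 0
    for i in range(closure.shape[0]):
        if labels[i] < 0:
            labels[closure[i] >= lam] = next_label
            next_label += 1
    return labels


def representative_genes(expr, labels):
    """For each cluster, return the gene closest to the cluster center vector.

    Assumption: the center is the mean profile and the distance is Euclidean.
    """
    reps = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        center = expr[idx].mean(axis=0)
        dist = np.linalg.norm(expr[idx] - center, axis=1)
        reps.append(int(idx[np.argmin(dist)]))
    return reps


# Toy data: two groups of three genes that follow uncorrelated base profiles.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
group_a = [np.sin(t) + 0.05 * rng.normal(size=t.size) for _ in range(3)]
group_b = [np.cos(t) + 0.05 * rng.normal(size=t.size) for _ in range(3)]
expr = np.vstack(group_a + group_b)

closure = transitive_closure(fuzzy_similarity(expr))
labels = lambda_cut_clusters(closure, lam=0.9)
reps = representative_genes(expr, labels)
print(labels, reps)
```

Lowering `lam` merges clusters and raising it splits them, so the threshold plays the role of choosing how many candidate genetic tags survive the clustering step.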

Authors: WANG Xiaoling (王晓玲), DOU Jihong (窦霁虹), WANG Yong (王勇), WANG Shaoyin (王劭寅)

Affiliation: Department of Mathematics, Northwest University

Source: Journal of Northwest University (Natural Science Edition) (《西北大学学报（自然科学版）》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2012, No. 5, pp. 713-718 (6 pages)

Funding: Scientific Research Fund of the Shaanxi Provincial Education Department (11JK0511)

Keywords: genetic characteristic; fuzzy clustering analysis; correlation; genetic tags; Monte Carlo method; recognition rate

Classification number: O29 [Science: Applied Mathematics]



