Methods for Extracting Genetic Tags and Improving the Recognition Rate (original title: 提取基因标签的方法及识别率提高的方法)
Abstract Aim: To identify colon cancer patients accurately using as few genetic tags as possible. Methods: Based on the correlations among genes, fuzzy cluster analysis is used to cluster a large number of genes, and the distance between each gene and its gene center vector is introduced to build an optimization model. Next, according to their distribution across the samples, genes are divided into mutant genes and irrelevant genes. These two features are combined to build an optimization model. To improve the recognition rate, the Monte Carlo method is used to account for noise in the gene data. Finally, the optimization model is rebuilt to take the features of known genetic tags into account. Results: Without considering noise, 8 genetic tags are obtained, with a correct recognition rate of 72.6%; after noise is taken into account, the rate is 85.0%; after the known genetic tags are added, it is 87.1%; and when all genes that share the characteristics of the known genetic tags are included, 25 genetic tags are obtained and the recognition rate rises to 96.7%. Conclusion: The more gene features are considered, the higher the correct recognition rate.
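The abstract does not give the model itself, so the following is only a minimal sketch of two ideas it names: scoring genes by their distance to a cluster center vector, and using Monte Carlo sampling to estimate a recognition rate under noise. The Euclidean distance, the Gaussian noise model, the nearest-center classifier, and all names and toy values (`representative`, `monte_carlo_rate`, `sigma`, the gene and sample vectors) are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def center(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def representative(cluster):
    """Name of the gene whose expression vector lies closest to the cluster center."""
    c = center(list(cluster.values()))
    return min(cluster, key=lambda g: distance(cluster[g], c))


def monte_carlo_rate(samples, labels, classify, sigma, trials, seed=0):
    """Mean recognition rate when N(0, sigma^2) noise is added to every value."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        for x, y in zip(samples, labels):
            noisy = [v + rng.gauss(0.0, sigma) for v in x]
            correct += classify(noisy) == y
    return correct / (trials * len(samples))


# Hypothetical toy data: one cluster of three genes measured on three samples.
cluster = {"g1": [1.0, 2.0, 3.0], "g2": [1.2, 2.1, 2.9], "g3": [3.0, 0.5, 1.0]}
tag = representative(cluster)  # "g2" for this toy data

# Hypothetical samples described by two tag genes, with a nearest-center classifier.
tumor_center, normal_center = [2.0, 0.5], [0.5, 2.0]
samples = [[2.1, 0.4], [1.9, 0.6], [0.4, 2.2], [0.6, 1.8]]
labels = ["tumor", "tumor", "normal", "normal"]


def classify(x):
    """Assign the label of the nearer of the two assumed class centers."""
    near_tumor = distance(x, tumor_center) < distance(x, normal_center)
    return "tumor" if near_tumor else "normal"


clean_rate = monte_carlo_rate(samples, labels, classify, sigma=0.0, trials=5)
noisy_rate = monte_carlo_rate(samples, labels, classify, sigma=0.5, trials=200)
```

With `sigma=0.0` the estimate reduces to the plain recognition rate on the unperturbed samples, which makes it a convenient sanity check before any noise is added.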
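Affiliation: Department of Mathematics, Northwest University (西北大学数学系)
Source: Journal of Northwest University (Natural Science Edition) (《西北大学学报(自然科学版)》), 2012, No. 5, pp. 713-718 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: Scientific Research Fund of the Shaanxi Provincial Department of Education (11JK0511)
Keywords: genetic characteristic; fuzzy clustering analysis; correlation; genetic tags; Monte Carlo method; recognition rate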