Analysis of Biology Sequences Based on Support Vector Machine
Abstract: The support vector machine (SVM) is a relatively new machine learning method. It satisfies the structural risk minimization principle and handles high-dimensional feature spaces, so it has been widely applied in biological sequence analysis. Based on the characteristics of gene sequences, this paper proposes a new kernel function, the position weight subsequence kernel. The kernel combines the composition of the subsequences in a gene sequence with their position information, and so represents sequence features fairly fully. The kernel is applied to the identification of gene splice sites. The experimental results show that an SVM with the position weight subsequence kernel identifies splice sites effectively and achieves higher precision than other methods.
Authors: 晏春 (Yan Chun), 王正志 (Wang Zhengzhi)
Source: Computer Simulation (《计算机仿真》), CSCD, 2006, No. 9, pp. 69-71 (3 pages)
Funding: National Natural Science Foundation of China (60471003)
Keywords: support vector machine (SVM); kernel function; biological sequence analysis
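
The abstract describes a kernel that combines subsequence composition with position information, but it does not give the formula. The Python sketch below is therefore only an illustration of that general idea and is not the authors' definition. It assumes equal-length sequence windows, counts fixed-length subsequences (k-mers) that match at the same position, and scales each match by a position weight. The function names, the parameter `k`, and the weighting scheme that favors positions near an assumed splice junction are all hypothetical.

```python
from typing import List, Optional, Sequence


def junction_weights(n_positions: int, center: int) -> List[float]:
    """Hypothetical weighting: positions closer to `center` (an assumed
    splice junction index) get larger weights, decaying as 1 / (1 + distance)."""
    return [1.0 / (1.0 + abs(i - center)) for i in range(n_positions)]


def position_weighted_kmer_kernel(
    s: str,
    t: str,
    k: int = 3,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Sum of position weights over every position where the length-k
    subsequences of s and t are identical.

    Assumption: s and t are aligned windows of equal length. With
    non-negative weights this is a sum of valid kernels, so it is itself
    a valid (positive semidefinite) kernel.
    """
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    n_positions = len(s) - k + 1
    if n_positions <= 0:
        raise ValueError("k is longer than the sequences")
    if weights is None:
        weights = [1.0] * n_positions  # uniform weights: plain match count
    if len(weights) != n_positions:
        raise ValueError("need one weight per k-mer start position")

    total = 0.0
    for i in range(n_positions):
        if s[i:i + k] == t[i:i + k]:
            total += weights[i]
    return total


def gram_matrix(
    seqs: Sequence[str],
    k: int = 3,
    weights: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Pairwise kernel values for a list of equal-length sequences."""
    return [
        [position_weighted_kmer_kernel(a, b, k, weights) for b in seqs]
        for a in seqs
    ]


# Toy windows (made-up data, for illustration only).
same = position_weighted_kmer_kernel("ACGTAC", "ACGTAC", k=3)   # 4.0
one_hit = position_weighted_kmer_kernel("ACGTAC", "ACGAAC", k=3)  # 1.0
```

A Gram matrix built this way could be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Whether such a weighting resembles the one used in the paper cannot be determined from the abstract alone.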
