
Splice Site Prediction Based on Characteristics of Sequence Motif and Support Vector Machine (基于序列模式特征和SVM的剪切位点预测)

Cited by: 2
Abstract: Through statistical analysis of the bases in donor-site sequences from the HS3D dataset, the regularities with which bases occur in the neighborhood of donor sites are used to construct motifs, which serve as attributes of the DNA sequences. By assigning a value to each attribute, the character sequences are mapped to numeric vectors, and a Support Vector Machine (SVM) is trained on these vectors to classify and predict donor sites. The experimental results indicate that, compared with the improved motif scoring model, the proposed method effectively removes the influence of abnormal data on classification, and that transforming DNA character sequences into the numeric motif-attribute space is both effective and practical.
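The mapping step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual motifs or attribute values, so everything here is an assumption for illustration: a "motif" is taken to be a (position, base) pair that is frequent in aligned donor-site windows, the attribute value is that frequency, the threshold of 0.6 is arbitrary, and the toy sequences are hypothetical rather than drawn from HS3D. The function names `position_frequencies`, `select_motifs`, and `to_vector` are likewise hypothetical.

```python
from collections import Counter

BASES = "ACGT"

def position_frequencies(seqs):
    """Per-position base frequencies over equal-length, aligned sequences."""
    length = len(seqs[0])
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({b: counts[b] / len(seqs) for b in BASES})
    return freqs

def select_motifs(freqs, threshold=0.6):
    """Keep (position, base, frequency) triples that reach the threshold.

    Assumption: the threshold is illustrative, not a value from the paper.
    """
    return [(i, b, f[b]) for i, f in enumerate(freqs) for b in BASES if f[b] >= threshold]

def to_vector(seq, motifs):
    """Map a character sequence to one numeric attribute per motif.

    The attribute is the motif's frequency if the sequence matches it, else 0.
    """
    return [w if seq[i] == b else 0.0 for i, b, w in motifs]

# Toy aligned donor-site windows (hypothetical); the GT dinucleotide sits at indices 3-4.
donors = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT"]
motifs = select_motifs(position_frequencies(donors))
vector = to_vector("CAGGTTTGT", motifs)
print(vector)
```

Vectors produced this way have a fixed length (one entry per motif), so they could be passed to any standard SVM implementation as training or test features; that step is omitted here to keep the sketch dependency-free.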
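Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 5, pp. 180-182 (3 pages)
Funding: National Natural Science Foundation of China (grants 60774086, 60373107)
Keywords: sequence motif; splice site; Support Vector Machine (SVM)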

References (6)

  • 1. Yang Zhengrong. Mining SARS-CoV Protease Cleavage Data Using Non-orthogonal Decision Trees: a Novel Method for Decisive Template Selection[J]. Bioinformatics, 2005, 21: 2644-2650.
  • 2. Li Dongdong, Wang Zhengzhi, Du Yaohua, Yan Chun. A fast algorithm for motif discovery in DNA sequences[J]. Acta Biophysica Sinica, 2005, 21(2): 121-129. (Cited by: 3)
  • 3. Maisheng Y, Jason T L W. Algorithms for Splicing Junction Donor Recognition in Genomic DNA Sequences[C]//Proc. of IEEE International Joint Symposia on Intelligence and Systems. [S. l.]: IEEE Computer Society, 1998: 169-176.
  • 4. Salvatore R. HS3D, a Dataset of Homo Sapiens Splice Regions, and Its Extraction Procedure from a Major Public Database[J]. International Journal of Modern Physics C, 2002, 13(8): 1105-1117.
  • 5. Wren J D, William H H, Chandrasekaran S, et al. Markov Model Recognition and Classification of DNA/Protein Sequences Within Large Text Databases[J]. Bioinformatics, 2005, 21(21): 4046-4053.
  • 6. Lei Jing, Ruan Xiaogang. Correlation analysis between DNA sequences and splice sites[J]. Journal of Beijing University of Technology, 2004, 30(3): 295-298. (Cited by: 1)

Secondary references (21)

  • 1. RAMPONE Salvatore. Recognition of splice junctions on DNA sequences by BRAIN learning algorithm[J]. Bioinformatics, 1998, 14(8): 676-684.
  • 2. SUN Ying-fei, FAN Xiao-dan, LI Yan-da. Identifying splicing sites in eukaryotic RNA: Support vector machine approach[J]. Computers in Biology and Medicine, 2003, 33: 17-29.
  • 3. OGURA Hisakazu, HIDEYUKI Agata. A study of learning splice sites of DNA sequence by neural networks[J]. Comput Biol Med, 1997, 27(1): 67-75.
  • 4. Pevzner PA, Sze S. Combinatorial approaches to finding subtle signals in DNA sequences. In: Philip EB, Michael G, Russ BA, Nancy J, Debra H, Thomas L, Julie CM, Eric DS, Chris S, Shawn S, Helge W. Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology. San Diego: AAAI Press, 2000. 269-278.
  • 5. Hertz G, Stormo G. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics, 1999, 15(7): 563-577.
  • 6. Bailey T, Elkan C. Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning, 1995, 21(1-2): 51-80.
  • 7. Lawrence C, Altschul S, Boguski M, Liu J, Neuwald A, Wootton J. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science, 1993, 262(5131): 208-214.
  • 8. Buhler J, Tompa M. Finding motifs using random projections. J Comput Biol, 2002, 9(2): 225-242.
  • 9. Keich U, Pevzner PA. Finding motifs in the twilight zone. Bioinformatics, 2002, 18(10): 1374-1381.
  • 10. Keich U, Pevzner PA. Subtle motifs: defining the limits of motif finding algorithms. Bioinformatics, 2002, 18(10): 1382-1390.

Co-citing literature (2)

Co-cited literature (20)

  • 1. Lu Congde, Zhang Taiyi, Hu Jinyan. Support vector domain classifier based on multiplicative rules[J]. Chinese Journal of Computers, 2004, 27(5): 690-694. (Cited by: 21)
  • 2. Hu Zhengping, Zhang Ye. A two-layer support vector model classifier with rejection capability[J]. Acta Electronica Sinica, 2005, 33(7): 1200-1203. (Cited by: 3)
  • 3. Cao Shengyu, Liu Laifu. Hidden Markov models and their application in gene recognition[J]. Mathematics in Practice and Theory, 2006, 36(9): 212-218. (Cited by: 2)
  • 4. Yang Wuritu, Li Qianzhong, Liu Li, Fan Guoliang. Predicting 5′/3′ alternative splice sites in human genes using support vector machines[J]. Progress in Modern Biomedicine, 2007, 7(5): 790-792. (Cited by: 2)
  • 5. Vapnik V N. An Overview of Statistical Learning Theory[J]. IEEE Trans. on Neural Networks, 1999, 10(3): 988-999.
  • 6. Tax D M J. One-class Classification: Concept-learning in the Absence of Counter-examples[D]. Delft, The Netherlands: Faculty of Information Technology, Delft University of Technology, 2001-06.
  • 7. Tax D M J, Duin R P W. Support Vector Data Description[J]. Machine Learning, 2004, 54(1): 45-66.
  • 8. DASH D, GOPALAKRISHNAN V. Modeling DNA splice regions by learning Bayesian networks[C]. CBMI Tech Report, 2001, 11: 33-40.
  • 9. GUYON I, WESTON J, BARNHILL S, et al. Gene selection for cancer classification using support vector machines[J]. Machine Learning, 2002, 46(1-3): 389-422.
  • 10. SUN Y F, FAN X D, LI Y D. Identifying splicing sites in eukaryotic RNA: support vector machine approach[J]. Computers in Biology and Medicine, 2003, 33: 17-29.

Citing literature (2)

Secondary citing literature (2)
