Identification method of splice sites using comprehensive information (cited by: 2)
Abstract: To improve the accuracy of splice site identification, a method based on comprehensive information is proposed. By analyzing the splicing signals, splicing sequences, secondary structures of the flanking sequences, and the mechanisms of action of splicing factors at donor and acceptor sites, a signal model and a sequence model are built for donor sites and for acceptor sites, respectively. The Mfold package in the Vienna software is then used to predict the most stable secondary structure of the sequence flanking each splice site. The traditional four-letter nucleotide alphabet is converted into an eight-letter alphabet, so that each sequence is described with eight letters, and these combined sequence-structure strings are used to train the signal models and sequence models. Finally, the trained models are applied to identify splice sites. Experimental results show that the method identifies splice sites well, with a recognition accuracy above 95%.
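The conversion from a four-letter to an eight-letter alphabet can be illustrated with a minimal sketch. The abstract does not give the actual encoding, so the scheme below is an assumption: each nucleotide stays uppercase when unpaired and becomes lowercase when paired in the predicted secondary structure, giving eight symbols in total. The function name `to_eight_letter` and the dot-bracket input format are likewise hypothetical; in practice the structure string would come from a folding program.

```python
def to_eight_letter(seq: str, structure: str) -> str:
    """Merge a nucleotide sequence with its dot-bracket structure.

    Assumed encoding (not taken from the paper): unpaired bases are
    written as A, C, G, U and paired bases as a, c, g, u.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    # Normalize DNA-style input to the RNA alphabet.
    bases = seq.upper().replace("T", "U")

    encoded = []
    for base, mark in zip(bases, structure):
        if base not in "ACGU":
            raise ValueError(f"unexpected nucleotide: {base!r}")
        if mark == ".":
            encoded.append(base)          # unpaired position
        elif mark in "()":
            encoded.append(base.lower())  # paired position
        else:
            raise ValueError(f"unexpected structure symbol: {mark!r}")
    return "".join(encoded)


# A short hairpin: three paired bases on each side of a four-base loop.
print(to_eight_letter("GGGAAAUCCC", "(((....)))"))
```

Strings encoded this way carry both sequence and structure at every position, so a model trained on them sees the two kinds of information together, which is the idea the abstract describes.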
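Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, No. 3, pp. 111-114 (4 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61071174); National High Technology Research and Development Program of China (2008AA01Z148); Heilongjiang Province Science Foundation for Distinguished Young Scholars (JC200703).
Keywords: bioinformatics; splice sites; splice signal; alternative splicing; secondary structures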