
PREDICTION OF SPLICE SITES IN HUMAN 5' UTRs BY USE OF A POSITION SCORE FUNCTION BASED ON INCREMENT OF DIVERSITY
Abstract: Splice sites in 5' untranslated regions (UTRs) are not flanked by a transition from coding to non-coding sequence, so conventional splice site prediction methods perform poorly in UTRs. This paper applies a position score function based on increment of diversity to the recognition of splice sites in human 5' UTRs. For donor sites, with a positive-to-negative sample ratio of 1:17, the sensitivity is 66.91%, the positive predictive value (precision) is 68.54%, the overall accuracy is 96.45%, and the area under the ROC curve is 97.23%. For acceptor sites, with a positive-to-negative sample ratio of 1:24, the sensitivity is 77.19%, the positive predictive value is 29.37%, the overall accuracy is 91.78%, and the area under the ROC curve is 93.91%. These results are better than those of existing similar algorithms.
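The figures in the abstract are tied together by the class ratio: given the number of negatives per positive, the sensitivity, and the positive predictive value, the overall accuracy follows. The sketch below is a minimal consistency check, not the authors' evaluation code; the function names are hypothetical, and it assumes the stated 1:17 and 1:24 ratios are exact, which they may not be.

```python
def confusion_from_summary(neg_per_pos, sensitivity, ppv):
    """Return (tp, fp, fn, tn) per one positive sample implied by the summary figures."""
    tp = sensitivity                  # true positives per positive sample
    fn = 1.0 - sensitivity            # missed positives
    fp = tp * (1.0 - ppv) / ppv       # from ppv = tp / (tp + fp)
    tn = neg_per_pos - fp             # remaining negatives are correctly rejected
    return tp, fp, fn, tn


def implied_accuracy(neg_per_pos, sensitivity, ppv):
    """Overall accuracy (tp + tn) / (all samples) implied by the summary figures."""
    tp, fp, fn, tn = confusion_from_summary(neg_per_pos, sensitivity, ppv)
    return (tp + tn) / (1.0 + neg_per_pos)


# Values taken from the abstract; the ratios are assumed exact (an assumption).
donor = implied_accuracy(17, 0.6691, 0.6854)
acceptor = implied_accuracy(24, 0.7719, 0.2937)
print(f"donor accuracy implied:    {donor:.4f}")
print(f"acceptor accuracy implied: {acceptor:.4f}")
```

Under these assumptions the donor figure comes out close to the reported 96.45%, while the acceptor figure comes out slightly below the reported 91.78%; a small gap of this kind would be expected if the published ratios are rounded.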
Authors: Chen Liping (陈丽萍), Lü Jun (吕军)
Source: Journal of Inner Mongolia University of Technology (Natural Science Edition), 2009, No. 4, pp. 274-278 (5 pages)
Funding: Key Fund Project of Inner Mongolia University of Technology (ZD200607)
Keywords: 5' untranslated regions; recognition of splice sites; position score function based on increment of diversity
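This record names increment of diversity but does not define it. As a rough illustration only, the sketch below assumes the definition commonly used in the diversity-measure literature: D(X) = N ln N - sum(n_i ln n_i) for a count vector X with total N, and ID(X, S) = D(X + S) - D(X) - D(S). The paper's actual position score function is not given here and is not reproduced; the function names and the toy count vectors are hypothetical.

```python
import math


def diversity(counts):
    """Diversity measure D = N ln N - sum(n_i ln n_i); zero counts contribute nothing."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, s):
    """ID(X, S) = D(X + S) - D(X) - D(S) for two count vectors of equal length."""
    if len(x) != len(s):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, s)]
    return diversity(merged) - diversity(x) - diversity(s)


# Toy nucleotide counts (A, C, G, T); illustrative values, not data from the paper.
query = [1, 0, 8, 1]
similar_source = [10, 2, 80, 8]
dissimilar_source = [40, 30, 5, 25]

print(increment_of_diversity(query, similar_source))
print(increment_of_diversity(query, dissimilar_source))
```

With this definition a smaller increment indicates that the query composition resembles the source, so a classifier built on it would favor the class whose source gives the lower value.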

References (10 of 11 shown)

  • 1. Brunak S, Engelbrecht J, Knudsen S. Prediction of Human mRNA Donor and Acceptor Sites from the DNA Sequence [J]. J. Mol. Biol., 1991, 220(1): 49-65.
  • 2. Pertea M, Lin XY, Salzberg SL. GeneSplicer: A New Computational Method for Splice Site Prediction [J]. Nucleic Acids Res., 2001, 29(5): 1185-1190.
  • 3. Korf I, Flicek P, Duan D, Brent MR. Integrating Genomic Homology into Gene Structure Prediction [J]. Bioinformatics, 2001, 17(Suppl. 1): S140-S148.
  • 4. Burge C, Karlin S. Prediction of Complete Gene Structure in Human Genomic DNA [J]. J. Mol. Biol., 1997, 268(1): 78-94.
  • 5. Zhang LR, Luo LF. Splice Site Prediction with Quadratic Discriminant Analysis Using Diversity Measure [J]. Nucleic Acids Res., 2003, 31(21): 6214-6220.
  • 6. Eden E, Brunak S. Analysis and Recognition of 5' UTR Intron Splice Sites in Human Pre-mRNA [J]. Nucleic Acids Res., 2004, 32(3): 1131-1142.
  • 7. Yan C, Du YH, Gao QB, Wang ZZ. Recognition of Splice Sites in Human 5' Untranslated Regions Based on Support Vector Machine [J]. Acta Biophysica Sinica, 2005, 21(4): 284-288. (In Chinese.)
  • 8. Jin HY, Luo LF, Zhang LR. Using Estimative Reaction Free Energy to Predict Splice Sites and Their Flanking Competitors [J]. Gene, 2008, 424(1-2): 115-120.
  • 9. Yeo G, Burge CB. Maximum Entropy Modeling of Short Sequence Motifs with Applications to RNA Splicing Signals [J]. J. Comp. Biol., 2004, 11(2-3): 377-394.
  • 10. Saxonov S, Daizadeh I, Gilbert W. An Exhaustive Database of Protein-coding Intron-containing Genes [J]. Nucleic Acids Res., 2000, 28(1): 185-190.


