
Predicting the Splice Sites of Human Genome Based on Position-correlation Weight Matrix and DNA Structural Parameters

Cited by: 1
Abstract: Recognizing human splice sites is an important topic in current research. Exploiting the sequence conservation of the regions flanking human splice sites, this work uses a position-correlation weight matrix (PCWM) scoring function together with DNA structural information (geometric descriptors) as input features for support vector machine (SVM) models that predict donor and acceptor splice sites in the human genome. For donor sites, the overall prediction accuracy is 92.55% under 5-fold cross-validation and 92.25% under a 3-way data split. For acceptor sites, the overall prediction accuracy is 90.70% under 5-fold cross-validation and 89.87% under a 3-way data split.
Source: Journal of Inner Mongolia University: Natural Science Edition (《内蒙古大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心), 2010, No. 4, pp. 390-397 (8 pages)
Funding: Inner Mongolia Outstanding Discipline Leaders Program (No. 20060702)
Keywords: splice site; position-correlation weight matrix (PCWM); DNA structural parameters
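
The abstract describes scoring candidate sites with a weight matrix that captures correlations between positions. The record does not give the PCWM formula, so the following is only a minimal sketch under stated assumptions: it uses a per-position log-odds table over adjacent dinucleotides as a hypothetical stand-in, and the function names and the toy sequence windows are invented for illustration, not taken from the paper.

```python
import math
from collections import Counter

BASES = "ACGT"

def dinucleotide_log_odds(pos_seqs, neg_seqs, pseudocount=1.0):
    """Build a per-position log-odds table over adjacent dinucleotides.

    Hypothetical stand-in for a position-correlation weight matrix:
    table[i][xy] compares how often dinucleotide xy starts at position i
    in true sites versus decoy sites. All windows must have equal length
    and contain only A, C, G, T.
    """
    length = len(pos_seqs[0])
    table = []
    for i in range(length - 1):
        pos_counts = Counter(s[i:i + 2] for s in pos_seqs)
        neg_counts = Counter(s[i:i + 2] for s in neg_seqs)
        row = {}
        for x in BASES:
            for y in BASES:
                # Pseudocounts keep unseen dinucleotides from giving log(0).
                p = (pos_counts[x + y] + pseudocount) / (len(pos_seqs) + 16 * pseudocount)
                q = (neg_counts[x + y] + pseudocount) / (len(neg_seqs) + 16 * pseudocount)
                row[x + y] = math.log(p / q)
        table.append(row)
    return table

def score(table, seq):
    """Sum the log-odds of every adjacent dinucleotide in the window."""
    return sum(row[seq[i:i + 2]] for i, row in enumerate(table))

# Toy, made-up 9-nt windows: "true" donor-like windows carry GT at positions 3-4.
true_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT"]
decoys = ["CAGCTAAGT", "ATGGCGACT", "CCGATAAGA", "TAGCAATGT"]

table = dinucleotide_log_odds(true_sites, decoys)
true_site_score = score(table, "CAGGTAAGT")
decoy_score = score(table, "CAGCTAAGT")
print(true_site_score > decoy_score)
```

In a pipeline like the one the abstract outlines, a score of this kind would be one input feature alongside DNA structural parameters, and a classifier such as scikit-learn's `sklearn.svm.SVC` could be evaluated with 5-fold cross-validation. That pairing is an assumption about tooling, since the record does not say which SVM implementation was used.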


