
A gene-prediction algorithm based on statistical combination and classification by CpG content

Cited by: 2
Abstract: Gene prediction underpins research on gene function, gene expression, the relationships among genes, and the control of gene transcription. Existing gene-prediction methods achieve high accuracy on internal coding exons but perform poorly on 5' exons. Targeting 5' exons, this paper presents a gene-prediction algorithm based on statistical combination and classification by CpG content: the gene-region data are divided into two relatively independent sets according to their CpG content, and a statistical combination then integrates several gene-prediction methods to make the prediction. Experimental results show that the algorithm improves the accuracy of gene prediction, offering one possible approach for further research on the problem.
Source: Beijing Biomedical Engineering (《北京生物医学工程》), 2007, No. 2, pp. 178-181, 190 (5 pages)
Funding: Supported by the Key Project for High-Level University Construction, University of Science and Technology of China
Keywords: gene prediction; CpG content; linear combination; statistical combination
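
The two-step idea in the abstract (split sequences by CpG content, then combine several predictors) can be illustrated with a minimal sketch. The abstract does not give the paper's CpG measure, its class boundary, or its combination rule, so everything here is an assumption made for illustration: `cpg_content` uses the plain CG dinucleotide frequency, the `threshold` of 0.05 is arbitrary, and the combination is a simple per-class weighted (linear) sum of predictor scores. All function names are hypothetical.

```python
from typing import Callable, Dict, Sequence


def cpg_content(seq: str) -> float:
    """Fraction of adjacent base pairs that form the dinucleotide CG.

    Assumption: the paper may define CpG content differently.
    """
    s = seq.upper()
    if len(s) < 2:
        return 0.0
    hits = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")
    return hits / (len(s) - 1)


def cpg_class(seq: str, threshold: float = 0.05) -> str:
    """Assign a sequence to one of two sets; the threshold is illustrative."""
    return "high" if cpg_content(seq) >= threshold else "low"


def combined_score(
    seq: str,
    predictors: Sequence[Callable[[str], float]],
    weights: Dict[str, Sequence[float]],
    threshold: float = 0.05,
) -> float:
    """Weighted sum of predictor scores, using the weights of the sequence's CpG class."""
    class_weights = weights[cpg_class(seq, threshold)]
    return sum(w * predict(seq) for w, predict in zip(class_weights, predictors))


# Toy usage: two stand-in predictors and hand-picked weights per CpG class.
toy_predictors = [lambda s: 1.0, lambda s: 0.0]
toy_weights = {"high": [0.25, 0.75], "low": [0.75, 0.25]}
score_high = combined_score("CGCG", toy_predictors, toy_weights)
score_low = combined_score("ATAT", toy_predictors, toy_weights)
```

In practice the per-class weights would be estimated from training data for each CpG set separately; the hand-picked values above only show how the class selects the weight vector.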


