A Modified Statistically Optimal Null Filter Method for Recognizing Protein-coding Regions (Cited by: 1)

Abstract: Computer-aided protein-coding gene prediction in uncharacterized genomic DNA sequences is one of the most important issues in biological signal processing. A modified filter method based on statistically optimal null filter (SONF) theory is proposed for recognizing protein-coding regions. The square deviation gain (SDG) between the input and output of the model is used to identify the coding regions. An effective SDG amplification model with Class I and Class II enhancement is designed to suppress the non-coding regions. An evaluation algorithm is also used to compare the modified model with most currently available gene prediction methods in terms of sensitivity, specificity and precision. Performance in identifying protein-coding regions was evaluated at the nucleotide level on benchmark datasets, yielding a sensitivity of 91.4%, a specificity of 96% and a precision of 93.7%. These results suggest that the proposed model is potentially useful in the gene-finding field and can help recognize protein-coding regions with higher precision and speed than existing algorithms.
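The nucleotide-level evaluation mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the paper's formulas, so the definitions below are the common confusion-matrix ones and should be read as an assumption; gene-finding literature sometimes defines specificity as TP / (TP + FP) instead. The function name `nucleotide_level_metrics` and the 0/1 label encoding (1 = coding, 0 = non-coding) are hypothetical choices made for illustration only.

```python
from typing import Dict, Sequence


def nucleotide_level_metrics(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Compare per-nucleotide labels (1 = coding, 0 = non-coding).

    Assumed definitions (not taken from the paper):
      sensitivity = TP / (TP + FN)
      specificity = TN / (TN + FP)
      precision   = TP / (TP + FP)
    """
    if len(actual) != len(predicted):
        raise ValueError("label sequences must have the same length")

    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # coding nucleotide correctly predicted as coding
        elif a == 0 and p == 1:
            fp += 1  # non-coding nucleotide wrongly predicted as coding
        elif a == 0 and p == 0:
            tn += 1  # non-coding nucleotide correctly predicted as non-coding
        else:
            fn += 1  # coding nucleotide missed by the predictor

    def ratio(num: int, den: int) -> float:
        # Return 0.0 when the denominator is empty to avoid division by zero.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Toy labels for an 8-nucleotide sequence, purely illustrative.
    actual = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 0, 0, 1, 0, 1]
    print(nucleotide_level_metrics(actual, predicted))
```

The toy labels are not drawn from any benchmark dataset; they only show how per-nucleotide predictions map to the three reported measures under the assumed definitions.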

Authors: Lei Zhang, Fengchun Tian, Shiyuan Wang

Affiliations: College of Communication Engineering; School of Electronic and Information Engineering

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报, English edition), indexed in CAS and CSCD, 2012, Issue 3, pp. 166-173 (8 pages)

Funding: Supported by the Fundamental Research Funds for the Central Universities (Grant No. CDJXS10160001) and the Central University Postgraduates' Science and Innovation Funds of China (Grant No. CDJXS12160005).

Keywords: Gene prediction; Biological signal processing; Protein-coding region; Square deviation gain

Classification: Q78 [Biology: Molecular Biology]; TN713 [Electronics and Telecommunications: Circuits and Systems]
