
Human promoter recognition based on weighted Bayesian classifier

Cited by: 1
Abstract: The promoter region of a gene controls the initiation of its transcription. Eukaryotic promoter prediction is therefore the most important problem in DNA sequence analysis, and also a very difficult task. Estimating the positional densities of oligonucleotides in promoters with a Gaussian mixture model (GMM) and using them as the feature vector is an effective approach. However, the number of mixture components G is usually chosen to be very large, so training the model takes a long time. Because the positional distribution differs from one oligonucleotide to another, this paper uses fuzzy clustering to determine the optimal number of mixture components separately for each oligonucleotide, which improves the accuracy with which the positional distributions are estimated and reduces computation time. A weighted naïve Bayes classifier based on the least-squares method is then proposed for human promoter recognition, further improving recognition accuracy. Simulation results show that the proposed algorithm achieves good prediction performance.
Authors: 郭烁, 朱义胜
Source: 《电路与系统学报》 (Journal of Circuits and Systems), CSCD, 北大核心 (Peking University Core Journal), 2010, No. 4, pp. 33-37 (5 pages)
Funding: National Natural Science Foundation of China (50877004)
Keywords: promoter; oligonucleotide; fuzzy clustering; Gaussian mixture model (GMM); least squares; weighted Bayesian classifier (WNB)
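
The abstract names the building blocks but does not give their exact formulation, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each oligonucleotide contributes one feature (its position in the sequence), that the promoter-class density of that position is a one-dimensional GMM, that the background density is uniform over the sequence length, and that the classifier weights are fitted by ordinary least squares on the per-feature log-likelihood ratios with targets +1 (promoter) and -1 (non-promoter). All function names, the toy GMM parameters, and the toy training positions are hypothetical. The fuzzy-clustering step that selects the number of components per oligonucleotide is not shown; here the component counts are simply fixed by hand.

```python
import math


def gmm_density(x, components):
    """1-D Gaussian mixture density at x; components = [(mix, mean, std), ...]."""
    total = 0.0
    for mix, mean, std in components:
        z = (x - mean) / std
        total += mix * math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    return total


def log_likelihood_ratios(positions, promoter_gmms, seq_len):
    """Per-feature log ratio of promoter GMM density to a uniform background."""
    background = 1.0 / seq_len
    return [math.log(max(gmm_density(p, g), 1e-300) / background)
            for p, g in zip(positions, promoter_gmms)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_weights(rows, labels, ridge=1e-6):
    """Least-squares weights via the normal equations (tiny ridge for stability)."""
    d = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    aty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(d)]
    return solve_linear_system(ata, aty)


# Hypothetical promoter-class GMMs for two oligonucleotides (toy values).
PROMOTER_GMMS = [
    [(1.0, 50.0, 5.0)],                    # one component
    [(0.5, 20.0, 4.0), (0.5, 80.0, 4.0)],  # two components
]
SEQ_LEN = 100

# Toy training set: (positions of the two oligonucleotides, label).
train = [((50, 20), 1), ((52, 80), 1), ((47, 22), 1), ((49, 78), 1),
         ((10, 50), -1), ((90, 45), -1), ((5, 55), -1), ((95, 50), -1)]

rows = [log_likelihood_ratios(p, PROMOTER_GMMS, SEQ_LEN) for p, _ in train]
labels = [y for _, y in train]
weights = fit_weights(rows, labels)


def predict(positions):
    """Return +1 (promoter) or -1 (non-promoter) from the weighted score."""
    llr = log_likelihood_ratios(positions, PROMOTER_GMMS, SEQ_LEN)
    score = sum(w * v for w, v in zip(weights, llr))
    return 1 if score > 0 else -1


print(weights, predict((51, 79)), predict((8, 48)))
```

With all weights equal to 1 the score reduces to the usual naïve Bayes log-odds (up to the class prior), so the least-squares fit can be read as rescaling each oligonucleotide's evidence according to how well it separates the two classes in the training data.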


