A generative probabilistic model for protein Loop structure modeling
Abstract: Predicting the three-dimensional structure of a protein from its amino acid sequence remains one of the major unsolved problems in computational biology, and predicting the structure of Loop fragments is one of its main difficulties. This paper presents a probabilistic model of protein Loop structure based on a first-order Markov model. Once trained, the model can be used to model and sample Loop fragments given the amino acid sequence and secondary structure information. Protein structure is described by the real-valued dihedral angle pairs of the Ramachandran plot, and training and inference are carried out with the Mocapy DBN toolkit. KL divergence and angular deviation serve as the evaluation criteria in experiments on the distribution of Loop dihedral angles, and in the de novo Loop structure prediction experiment the structures of 8 free-modeling targets from CASP8 are predicted. Compared with the most widely used state-of-the-art methods, the model improves Loop prediction accuracy, so that the sampled dihedral angle pairs lie closer to the distribution observed in real Loop structures, and it improves the accuracy of the full protein backbone in de novo prediction. Because the model is a generative probabilistic model that supports probabilistic inference, it is also less biased in theory.
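The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model or the Mocapy API: the two hidden states, the transition probabilities, and the emission parameters are hypothetical numbers chosen for illustration, the conditioning on amino acid and secondary structure is omitted, and phi and psi are drawn from two independent von Mises distributions instead of the bivariate von Mises distribution named in the keywords.

```python
import math
import random

# Hypothetical two-state model; every number below is illustrative, not fitted.
INITIAL = [0.5, 0.5]
TRANSITION = [[0.8, 0.2],
              [0.3, 0.7]]
# Per-state emission: (mean phi, mean psi, concentration kappa), angles in radians.
EMISSION = [(math.radians(-60.0), math.radians(-45.0), 8.0),
            (math.radians(-120.0), math.radians(130.0), 8.0)]

TWO_PI = 2.0 * math.pi


def wrap(angle):
    """Map an angle in radians onto the interval [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi


def sample_loop(length, rng):
    """Sample `length` (phi, psi) pairs from a first-order Markov chain."""
    states = range(len(INITIAL))
    state = rng.choices(states, weights=INITIAL)[0]
    angles = []
    for _ in range(length):
        mu_phi, mu_psi, kappa = EMISSION[state]
        # Simplification (assumption): phi and psi are drawn independently,
        # whereas a bivariate von Mises distribution also models their correlation.
        phi = wrap(rng.vonmisesvariate(mu_phi % TWO_PI, kappa))
        psi = wrap(rng.vonmisesvariate(mu_psi % TWO_PI, kappa))
        angles.append((phi, psi))
        # First-order Markov step: the next state depends only on the current one.
        state = rng.choices(states, weights=TRANSITION[state])[0]
    return angles


if __name__ == "__main__":
    for phi, psi in sample_loop(6, random.Random(0)):
        print(f"phi={math.degrees(phi):7.1f}  psi={math.degrees(psi):7.1f}")
```

Passing a seeded `random.Random` instance keeps the sketch reproducible; a trained model would replace the hard-coded tables with learned parameters.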
Source: Computers and Applied Chemistry (《计算机与应用化学》), 2010, Issue 5, pp. 573-576 (4 pages); indexed in CAS, CSCD, and the Peking University Core Journals list
Funding: National Natural Science Foundation of China (60970055)
Keywords: protein Loop; first-order Markov generative probabilistic model; bivariate von Mises distribution
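The two evaluation criteria named in the abstract, angular deviation and KL divergence, can be sketched as follows. This is an assumed formulation, not the paper's exact protocol: the 36 x 36 binning of the Ramachandran plane, the pseudocount, and all function names are hypothetical choices made for this example.

```python
import math

TWO_PI = 2.0 * math.pi


def angular_deviation(a, b):
    """Smallest absolute difference between two angles in radians, in [0, pi]."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def ramachandran_histogram(pairs, bins=36, pseudocount=1.0):
    """Flattened bins x bins histogram of (phi, psi) pairs, normalised to sum to 1.

    The pseudocount (an assumption of this sketch) keeps every bin non-zero
    so that the KL divergence below is always finite.
    """
    width = TWO_PI / bins
    counts = [pseudocount] * (bins * bins)
    for phi, psi in pairs:
        i = int(((phi + math.pi) % TWO_PI) / width) % bins
        j = int(((psi + math.pi) % TWO_PI) / width) % bins
        counts[i * bins + j] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(p_k * math.log(p_k / q_k) for p_k, q_k in zip(p, q) if p_k > 0.0)


if __name__ == "__main__":
    observed = [(math.radians(-60.0), math.radians(-45.0))] * 50
    sampled = [(math.radians(-120.0), math.radians(130.0))] * 50
    p = ramachandran_histogram(observed)
    q = ramachandran_histogram(sampled)
    print("KL(observed || sampled) =", kl_divergence(p, q))
    print("deviation (rad) =", angular_deviation(math.radians(179.0), math.radians(-179.0)))
```

The wraparound in `angular_deviation` matters because dihedral angles are periodic: 179 degrees and -179 degrees differ by 2 degrees, not 358.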
References (15)

  • 1. Rohl C A, Strauss C E M, Misura K M S and Baker D. Protein structure prediction using Rosetta. Methods in Enzymology, 2004, 383: 66-93.
  • 2. Bradley P, Misura K M S and Baker D. Toward high-resolution de novo structure prediction for small proteins. Science, 2005, 309(5742): 1868-1871.
  • 3. Boomsma W, Mardia K V, Taylor C C, Ferkinghoff-Borg J, Krogh A and Hamelryck T. A generative, probabilistic model of local protein structure. PNAS, 2008, 105: 8932-8937.
  • 4. Ramachandran G N, Ramakrishnan C and Sasisekharan V. Stereochemistry of polypeptide chain configurations. J Mol Biol, 1963, 7: 95-99.
  • 5. Mardia K V, Taylor C C and Subramaniam G K. Protein bioinformatics and mixtures of bivariate von Mises distributions for angular data. Biometrics, 2007, 63: 505-512.
  • 6. Van Walle I, Lasters I and Wyns L. SABmark: a benchmark for sequence alignment that covers the entire known fold space. Bioinformatics, 2005, 21: 1267-1268.
  • 7. Murzin A G, Brenner S E, Hubbard T and Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol, 1995, 247: 536-540.
  • 8. BackboneDBN [http://sourceforge.net/projects/phaistos/].
  • 9. Hamelryck T. Mocapy: A parallelized toolkit for learning and inference in dynamic Bayesian networks. Copenhagen: Univ of Copenhagen, 2007.
  • 10. CASP8 [http://predictioncenter.gc.ucdavis.edu/casp8/results.cgi].
