Multiple Motif Discovery in Biological Sequences by Mixture Gibbs Sampling
(Chinese title: 生物序列模体的混合Gibbs抽样识别算法)

Abstract: A new mixture Gibbs sampling algorithm is proposed for motif discovery in biological sequences. The algorithm learns a mixture motif model and uses a greedy strategy that adds new motifs to the mixture one at a time by maximizing the likelihood. Two sampling methods, site sampling and motif sampling, are designed and applied alternately. To speed up the search, the input dataset is partitioned hierarchically using kd-trees. Experimental results indicate that the algorithm has a clear advantage in identifying large groups of motifs characteristic of sequence families, and that it builds motif models with stronger statistical power, which improves sequence classification accuracy.
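Source: Acta Electronica Sinica (电子学报), 2008, No. 4, pp. 750-755 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (No. 60705004); Natural Science Foundation of Shaanxi Province (No. 2005F33)
Keywords: bioinformatics; motif discovery; Gibbs sampling; mixture motifs model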
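
To make the site-sampling step named in the abstract more concrete, the following is a minimal sketch of the classic one-site-per-sequence Gibbs site sampler for DNA. It is not the paper's mixture algorithm: it omits motif sampling, the greedy addition of motifs to a mixture, and the kd-tree partitioning, none of which are specified in enough detail in this record to reproduce. All function names, the pseudocount, and the default number of sweeps are assumptions chosen for illustration.

```python
import math
import random

ALPHABET = "ACGT"  # assumption: DNA sequences over the four-letter alphabet


def build_profile(sequences, starts, width, skip, pseudocount=1.0):
    """Column-wise residue frequencies from the current sites, leaving out sequence `skip`."""
    counts = [{a: pseudocount for a in ALPHABET} for _ in range(width)]
    for i, (seq, s) in enumerate(zip(sequences, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][seq[s + j]] += 1.0
    profile = []
    for col in counts:
        total = sum(col.values())
        profile.append({a: c / total for a, c in col.items()})
    return profile


def background(sequences, pseudocount=1.0):
    """Residue frequencies over all sequences, used as the background model."""
    counts = {a: pseudocount for a in ALPHABET}
    for seq in sequences:
        for ch in seq:
            counts[ch] += 1.0
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def site_weights(seq, width, profile, bg):
    """Motif-to-background likelihood ratio for every candidate start in `seq`."""
    weights = []
    for s in range(len(seq) - width + 1):
        log_ratio = sum(
            math.log(profile[j][seq[s + j]] / bg[seq[s + j]]) for j in range(width)
        )
        weights.append(math.exp(log_ratio))
    return weights


def gibbs_site_sampler(sequences, width, sweeps=200, seed=0):
    """Resample one motif start per sequence in turn; return the final starts."""
    rng = random.Random(seed)
    bg = background(sequences)
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(sweeps):
        for i, seq in enumerate(sequences):
            # Predictive update: the profile excludes the sequence being resampled.
            profile = build_profile(sequences, starts, width, skip=i)
            weights = site_weights(seq, width, profile, bg)
            starts[i] = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return starts
```

Calling `gibbs_site_sampler(seqs, width)` on a list of DNA strings returns one start position per sequence. Because the procedure is stochastic, a practical run would typically repeat it from several seeds and keep the alignment with the highest likelihood.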


