AP Clustering Algorithm for Solving the Planted (l, d) Motif Identification Problem
Abstract: Transcription factors bind to specific DNA sequences that control the gene transcription process; these sequences are called motifs. Motif identification uses computer algorithms to find a set of DNA sequence fragments that are similar in both function and form, and thereby to locate the transcription factor binding sites that govern the regulation of gene expression. It plays a crucial role in research on the structure and function of genes. The problem is first converted into a model that the AP (affinity propagation) clustering algorithm can process. AP clustering then yields stable clusters of candidate motifs, and a greedy algorithm refines the clustering results into a group of candidate motif sets. The candidate motif sets are evaluated with a relative entropy (information content) measure and the best set is output, which gives a new motif identification algorithm. Experimental results on both simulated data and real data demonstrate the effectiveness of the proposed algorithm.
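The final evaluation step in the abstract ranks candidate motif sets by relative entropy. The sketch below is only an illustration of that idea, not the authors' implementation: it assumes each candidate is a list of equal-length aligned l-mers over the alphabet ACGT, a uniform background distribution by default, and no pseudocounts. The function names `relative_entropy` and `best_candidate` are hypothetical.

```python
import math
from collections import Counter


def relative_entropy(lmers, background=None):
    """Relative entropy (in bits) of aligned l-mers against a background.

    Assumption: a uniform ACGT background unless one is supplied.
    """
    if not lmers:
        raise ValueError("need at least one l-mer")
    if background is None:
        background = {base: 0.25 for base in "ACGT"}
    length = len(lmers[0])
    if any(len(s) != length for s in lmers):
        raise ValueError("all l-mers must have the same length")

    n = len(lmers)
    score = 0.0
    for j in range(length):
        # Observed base frequencies in alignment column j.
        counts = Counter(s[j] for s in lmers)
        for base, count in counts.items():
            p = count / n
            score += p * math.log2(p / background[base])
    return score


def best_candidate(candidates):
    """Return the candidate motif set with the highest relative entropy."""
    return max(candidates, key=relative_entropy)


if __name__ == "__main__":
    conserved = ["ACGT", "ACGT"]
    mixed = ["ACGT", "TGCA"]
    print(relative_entropy(conserved), relative_entropy(mixed))
    print(best_candidate([mixed, conserved]))
```

Under these assumptions a perfectly conserved column contributes 2 bits, so better-conserved candidate sets score higher and are preferred at the output step.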
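Authors: Chen Kun (陈昆), Zhang Xiaojun (张小骏)
Source: Journal of Zhengzhou University (Engineering Science) (《郑州大学学报(工学版)》), 2015, No. 3, pp. 110-114 (5 pages). Indexed in: CAS, PKU Core (北大核心).
Funding: Fundamental Research Funds for the Central Universities (K50513100011)
Keywords: gene transcription; motif identification; AP clustering algorithm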