
Alignment-free Biomolecular Sequence Comparison Method (cited by: 3)

Original title: 无比对的生物分子序列比较方法
Abstract: Biological sequence analysis is a major research area of bioinformatics, in which useful information is often obtained through comparative analysis. The most common comparison method is sequence alignment. However, alignment-based comparison assumes that contiguity is conserved between homologous segments, which is at odds with genetic recombination, and multiple sequence alignment runs into difficulties such as computational complexity. These limitations have motivated research into alignment-free sequence comparison. This paper reviews the two main categories of current alignment-free methods. The first is based on the occurrence frequencies of words (oligomers) and their distribution: sequences are compared by computing distances in the Cartesian space defined by their frequency vectors. The second uses Kolmogorov complexity theory or chaos theory to compare sequences.
Source: Journal of Biomedical Engineering (《生物医学工程学杂志》), 2005, Issue 3, pp. 598-601, 605 (5 pages in total). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: biomolecular sequence; bioinformatics; frequency vector; G-protein; amino acid; sequence alignment; alignment-free sequence comparison; distance; complexity
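
The two categories named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, Euclidean distance is only one of several distances that could be applied to word-frequency vectors, and the second function uses the zlib compressed size as a rough, computable stand-in for Kolmogorov complexity (which cannot be computed exactly), combined in the normalized-compression-distance form.

```python
import math
import zlib
from collections import Counter
from itertools import product


def word_frequencies(seq, k, alphabet="ACGT"):
    """Relative frequencies of all overlapping k-letter words, in a fixed word order."""
    n_words = len(seq) - k + 1
    if n_words <= 0:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(seq[i:i + k] for i in range(n_words))
    # Enumerate every possible word so that both sequences share the same axes.
    # Words containing letters outside `alphabet` are ignored in the vector.
    words = ("".join(p) for p in product(alphabet, repeat=k))
    return [counts[w] / n_words for w in words]


def word_distance(s, t, k=2, alphabet="ACGT"):
    """Euclidean distance between the word-frequency vectors of s and t (assumed metric)."""
    u = word_frequencies(s, k, alphabet)
    v = word_frequencies(t, k, alphabet)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def compressed_size(seq):
    """Size in bytes of the zlib-compressed sequence (a proxy for its complexity)."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def compression_distance(s, t):
    """Normalized compression distance: small when s helps to compress t."""
    cs, ct, cst = compressed_size(s), compressed_size(t), compressed_size(s + t)
    return (cst - min(cs, ct)) / max(cs, ct)


if __name__ == "__main__":
    a = "ACGTTGCAACGTAGCTAGGCTA"
    b = "ACGTTGCAACGAAGCTAGGCTT"
    print(word_distance(a, b, k=2))
    print(compression_distance(a, b))
```

Neither function aligns the sequences: the first only counts words, so it is insensitive to the order in which segments appear, and the second only compares compressed sizes. On very short inputs the compression-based value is dominated by the compressor's fixed overhead, so it is more informative on longer sequences.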

