
Local Alignment-Free Sequences Based on D2shepp Statistics (cited by: 1)
Abstract: Comparing the similarity of two biological sequences is one of the main problems in computational biology, and D2shepp, a fast statistic based on the k-tuple content shared by two sequences, has been applied to alignment-free sequence comparison. In this paper, two mutually independent random sequences are generated under the null model, a local comparison of the two sequences is carried out with the D2shepp statistic, and the optimal local values are found and summed. On this basis, the distribution of the statistical power (Power value) is simulated and analyzed for different values of k. With identical parameters, the proposed local comparison is then compared with the existing global comparison based on the D2shepp statistic. The Power value of the local D2shepp statistic approaches 1 rapidly as the sequence length increases, and the local comparison is faster and more accurate than the global one.
Source: Journal of South China University of Technology (Natural Science Edition); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2012, No. 8, pp. 106-109 (4 pages).
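Funding: Supported by the National Natural Science Foundation of China (10947023, 61176061) and the Fundamental Research Funds for the Central Universities, South China University of Technology (20112M0088).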
Keywords: alignment-free sequence; D2shepp statistics; local alignment; Power value
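The abstract refers to a k-tuple statistic computed between two random sequences drawn under a null model, but this record does not give the formula. The following is a minimal Python sketch, assuming that D2shepp corresponds to the statistic usually written D2S in the alignment-free literature: the sum over all k-tuples w of Xc·Yc / sqrt(Xc² + Yc²), where Xc and Yc are the counts of w in each sequence minus their expected values under an i.i.d. letter model. The function names `kmer_counts`, `d2shepp`, and `random_sequence` are hypothetical, and the paper's local-window search and power simulation are not reproduced here.

```python
import itertools
import math
import random
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-tuples in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2shepp(seq_a, seq_b, k, letter_probs):
    """Assumed D2S-style statistic (not confirmed by the source record).

    Sums Xc * Yc / sqrt(Xc^2 + Yc^2) over all k-tuples, where Xc and Yc are
    k-tuple counts centred by their expectation under an i.i.d. null model
    with the given letter probabilities.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    n_a = len(seq_a) - k + 1  # number of k-tuple positions in seq_a
    n_b = len(seq_b) - k + 1  # number of k-tuple positions in seq_b
    total = 0.0
    for letters in itertools.product(sorted(letter_probs), repeat=k):
        word = "".join(letters)
        p_word = math.prod(letter_probs[c] for c in word)
        xc = counts_a[word] - n_a * p_word
        yc = counts_b[word] - n_b * p_word
        denom = math.sqrt(xc * xc + yc * yc)
        if denom > 0:  # skip words whose centred counts are both zero
            total += xc * yc / denom
    return total


def random_sequence(length, letter_probs, rng):
    """Draw an i.i.d. sequence from the null model (hypothetical helper)."""
    letters = sorted(letter_probs)
    weights = [letter_probs[c] for c in letters]
    return "".join(rng.choices(letters, weights=weights, k=length))
```

A call such as `d2shepp(random_sequence(200, probs, rng), random_sequence(200, probs, rng), 3, probs)`, with `probs` a dictionary of letter probabilities and `rng = random.Random(0)`, gives one draw from the null distribution. Repeating it many times would give an empirical null distribution against which power could be estimated.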

