Abstract
Measuring the similarity between two biological sequences is one of the central problems in computational biology, and D2shepp, a fast statistic based on the joint k-tuple content of two sequences, has been applied to alignment-free sequence comparison. In this paper, two mutually independent random sequences are generated under the null model, local alignment of the two sequences is carried out with the D2shepp statistic, and the optimal local values are found and summed. On this basis, the distribution of the statistical power (Power) is simulated and analyzed for different values of k. Finally, under identical parameters, the proposed local alignment is compared with the existing global alignment based on the D2shepp statistic. The Power value of the local D2shepp statistic rapidly approaches 1 as the sequence length increases, and the local approach is faster and more accurate than the global one.
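For orientation, the k-tuple statistic named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the commonly cited form of the Shepp-type statistic from the alignment-free literature (centred k-tuple counts, each term normalized by the root of the summed squares) and an i.i.d. letter model as the null model. The abstract states neither. The function names `random_sequence`, `kmer_counts`, and `d2_shepp` are hypothetical, and the local windowing and summation step of the paper is not reproduced because the abstract does not specify it.

```python
import math
import random
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def random_sequence(n, probs, rng):
    """Draw an i.i.d. sequence of length n (assumed null model)."""
    weights = [probs[c] for c in ALPHABET]
    return "".join(rng.choices(ALPHABET, weights=weights, k=n))


def kmer_counts(seq, k):
    """Count overlapping k-tuples in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_shepp(x, y, k, probs):
    """Shepp-type k-tuple statistic between x and y (assumed standard form).

    Each k-tuple count is centred by its expectation under the i.i.d.
    model, and each product term is divided by
    sqrt(x_centred**2 + y_centred**2). Terms with a zero denominator
    contribute nothing.
    """
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    nx, ny = len(x) - k + 1, len(y) - k + 1
    total = 0.0
    for letters in product(ALPHABET, repeat=k):
        w = "".join(letters)
        pw = math.prod(probs[c] for c in w)  # word probability under i.i.d.
        xt = cx[w] - nx * pw
        yt = cy[w] - ny * pw
        denom = math.sqrt(xt * xt + yt * yt)
        if denom > 0:
            total += xt * yt / denom
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    uniform = {c: 0.25 for c in ALPHABET}
    a = random_sequence(200, uniform, rng)
    b = random_sequence(200, uniform, rng)
    print(d2_shepp(a, b, 3, uniform))
```

Under these assumptions, a power study of the kind the abstract describes would repeat the computation over many simulated sequence pairs under the null model and under an alternative, then report the fraction of alternative pairs whose statistic exceeds a null-model quantile.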
Source
Journal of South China University of Technology (Natural Science Edition) [《华南理工大学学报(自然科学版)》]
EI
CAS
CSCD
Peking University Core Journals
2012, No. 8, pp. 106-109 (4 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 10947023, 61176061)
Fundamental Research Funds for the Central Universities, South China University of Technology (Grant No. 20112M0088)