Abstract
1 Introduction
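In bioinformatics, analyzing the sequences of biological macromolecules is a very basic task. By determining the genome sequence of the SARS virus (SARS is also known as atypical pneumonia), Paul A. Rota [1] identified the genes that carry the instructions for making proteins, including the genes recognized as coding for four proteins, and confirmed that the SARS virus is an entirely new coronavirus. This result will help speed up the diagnosis, treatment, and prevention of SARS.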
The local alignment problem for two sequences requires finding similar regions, one from each sequence, and aligning those regions. The Smith-Waterman algorithm for local sequence alignment is one of the best-known algorithms in computational molecular biology. This dynamic programming approach is designed to reveal highly conserved fragments by discarding poorly conserved initial and terminal segments. However, local alignment sometimes produces a mosaic of well-conserved fragments artificially connected by poorly conserved or even unrelated fragments. This can cause problems in the comparison of long genomic sequences and in comparative gene prediction. In this paper we propose a dynamic penalty strategy to address this problem. While the similarity matrix is being computed, the strategy is activated once the similarity value exceeds a pre-specified threshold X; from then on, a mismatch between the corresponding characters is penalized more heavily than usual, until the similarity value falls to 0 or the computation ends. Test results show that this algorithm performs better than the standard Smith-Waterman algorithm while not significantly increasing the computational complexity in either time or space.
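The dynamic penalty idea described above can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual recurrence, so everything concrete here is an assumption made for illustration: the function name `local_align_score`, the linear gap model, the scoring values, the `extra` mismatch surcharge, and the choice to carry an "active" flag along the best-scoring predecessor of each cell. It should be read as one possible interpretation, not as the authors' implementation.

```python
def local_align_score(a, b, match=2, mismatch=-1, gap=-2,
                      threshold=None, extra=2):
    """Return the best local alignment score of strings a and b.

    With threshold=None this is plain Smith-Waterman with a linear gap
    penalty. Otherwise (assumed interpretation of the dynamic penalty),
    once the running score of a path has exceeded `threshold`, every
    later mismatch on that path costs an additional `extra`, until the
    score falls back to 0.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    # active[i][j]: the best path ending at (i, j) has exceeded the threshold
    active = [[False] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                sub = match
            else:
                # heavier mismatch penalty while the strategy is active
                sub = mismatch - (extra if active[i - 1][j - 1] else 0)
            candidates = [
                (H[i - 1][j - 1] + sub, active[i - 1][j - 1]),  # diagonal
                (H[i - 1][j] + gap, active[i - 1][j]),          # gap in b
                (H[i][j - 1] + gap, active[i][j - 1]),          # gap in a
            ]
            score, was_active = max(candidates, key=lambda c: c[0])
            if score <= 0:
                # local alignment restarts here; the strategy is switched off
                H[i][j], active[i][j] = 0, False
            else:
                H[i][j] = score
                active[i][j] = was_active or (
                    threshold is not None and score > threshold)
            best = max(best, H[i][j])
    return best


# Two conserved blocks joined by a single mismatching column.
plain = local_align_score("AAAACAAAA", "AAAAGAAAA")
dynamic = local_align_score("AAAACAAAA", "AAAAGAAAA", threshold=5, extra=10)
```

In this toy example the dynamic variant scores the joined blocks lower than plain Smith-Waterman does, because bridging the two blocks across the mismatch becomes more expensive once the threshold has been passed. The sketch keeps the same O(mn) time and space as the standard algorithm, adding only one boolean per cell.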
Source
《计算机科学》
CSCD
PKU Core Journal (北大核心)
2003, Issue 11, pp. 44-47 (4 pages)
Computer Science
Funding
Electronic Science Foundation, grant 51415010101DZo2