Abstract
Repeat sequence analysis of non-coding regions plays an important role in genome research; its foundation is identifying and locating all repeat structures in non-coding DNA sequences. In computer science, repeat sequence identification is chiefly a string matching problem. Building on an analysis of suffix tree and suffix array string matching algorithms, the paper describes in detail a suffix-array-based method for identifying exact tandem repeats. Experiments indicate that the method is suitable for analyzing non-coding DNA sequences.
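The abstract does not give the algorithm's details, so the following is only a minimal sketch of how a suffix array can expose exact tandem repeats, not the authors' implementation. It assumes the simplest definition: a tandem repeat of period p starts at position i when the p characters from i equal the next p characters. The function names (`suffix_array`, `lcp_array`, `tandem_repeats`) are hypothetical, and the suffix array is built by naive sorting for brevity.

```python
def suffix_array(s):
    """Start positions of all suffixes of s, in lexicographic order (naive sort)."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """Kasai's algorithm: lcp[k] is the longest common prefix length of the
    suffixes at sa[k-1] and sa[k]; lcp[0] is 0."""
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def tandem_repeats(s, min_period=1):
    """Return sorted (start, period) pairs with s[i:i+p] == s[i+p:i+2p].

    Two suffixes starting at i and i+p form a tandem repeat exactly when
    their common prefix is at least p long. That length is the minimum of
    the LCP values between their ranks in the suffix array.
    """
    sa = suffix_array(s)
    lcp = lcp_array(s, sa)
    n = len(s)
    hits = set()
    for a in range(n):
        shared = n
        for b in range(a + 1, n):
            shared = min(shared, lcp[b])
            if shared == 0:
                break  # no later suffix shares a prefix with sa[a]
            i, j = sorted((sa[a], sa[b]))
            period = j - i
            if period >= min_period and shared >= period:
                hits.add((i, period))
    return sorted(hits)


if __name__ == "__main__":
    print(tandem_repeats("ACGACGT"))
```

This sketch is quadratic in the worst case and reports every repeated unit pair rather than maximal runs; a practical tool for long DNA sequences would use a faster suffix array construction and range-minimum queries over the LCP array.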
Authors
陈昌平
刘自伟
周文鹃
彭春艳
CHEN Chang-ping, LIU Zi-wei, ZHOU Wen-juan, PENG Chun-yan (Southwest University of Science and Technology, College of Computer Science and Technology, Mianyang 621000, China)
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2008, Issue 11, pp. 930-931, 937 (3 pages)