Abstract
Building on a study of concurrent sequential patterns and of pattern matching with gap constraints, this paper proposes the concept of a gap pattern and designs and implements PBcon, a concurrent sequential pattern mining algorithm based on gap patterns. Drawing on the existing support-based mechanism, the algorithm first mines 2-branch concurrent gap patterns and then progressively generates concurrent gap patterns with 3, 4, and more branches. The algorithm is validated on real protein data; compared with existing protein-related algorithms, PBcon shows advantages in both mining efficiency and mining quality.
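The level-wise idea described in the abstract (mine 2-branch patterns first, then grow to more branches) can be illustrated with a minimal sketch. This is not the PBcon algorithm itself: the function names `occurs`, `concurrent_support`, and `mine_concurrent`, the single global gap range `[min_gap, max_gap]`, and the definition of support as the fraction of sequences in which every branch occurs are all assumptions made for illustration, since the abstract does not spell out these details.

```python
from itertools import combinations

def occurs(pattern, seq, min_gap, max_gap):
    """True if the pattern's characters appear in order in seq, with the
    number of skipped positions between neighbours in [min_gap, max_gap]."""
    def extend(i, pos):
        if i == len(pattern):
            return True
        lo = pos + 1 + min_gap
        hi = min(pos + 1 + max_gap, len(seq) - 1)
        return any(seq[p] == pattern[i] and extend(i + 1, p)
                   for p in range(lo, hi + 1))
    return any(seq[p] == pattern[0] and extend(1, p) for p in range(len(seq)))

def concurrent_support(branches, seqs, min_gap, max_gap):
    """Assumed support: fraction of sequences in which ALL branches occur."""
    hits = sum(all(occurs(b, s, min_gap, max_gap) for b in branches)
               for s in seqs)
    return hits / len(seqs)

def mine_concurrent(patterns, seqs, min_gap, max_gap, min_sup):
    """Level-wise growth: frequent 2-branch sets first, then 3, 4, ..."""
    level = [frozenset(c) for c in combinations(patterns, 2)
             if concurrent_support(c, seqs, min_gap, max_gap) >= min_sup]
    result = list(level)
    while level:
        # Join two k-branch sets that differ in exactly one branch.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates
                 if concurrent_support(c, seqs, min_gap, max_gap) >= min_sup]
        result.extend(level)
    return result

# Toy data (hypothetical, not from the paper's protein data set).
seqs = ["ACDEFG", "AXCDYF", "CDAAAF"]
found = mine_concurrent(["AC", "DF", "CD"], seqs,
                        min_gap=0, max_gap=1, min_sup=0.6)
```

On this toy input, each pair of candidate gap patterns co-occurs in two of the three sequences, so all three 2-branch sets and the single 3-branch set pass the 0.6 threshold.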
Authors
杨梦涛
王翠青
陈未如
YANG Meng-tao; WANG Cui-qing; CHEN Wei-ru (Shenyang University of Chemical Technology, Shenyang 110142, China)
Source
《沈阳化工大学学报》
CAS
2019, Issue 2, pp. 183-187 (5 pages)
Journal of Shenyang University of Chemical Technology
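Funding
Liaoning Province Science and Technology Research Fund Project (2012219001)
Liaoning Provincial Department of Education Science and Technology Fund Project (L2013157)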
Keywords
concurrent sequential patterns
mining of biological sequence patterns
gap constraint
gap pattern