A New Biological Sequence Pattern Mining Algorithm

Mining Algorithm for Biological Sequence Pattern

Abstract: Traditional pattern mining methods, when applied to biological sequences, generate a large number of short, unnecessary, and useless patterns, which reduces efficiency. To address this problem, a mining algorithm called JBioPM (Joined Biology sequence Pattern Mining) is proposed. It is based on adjacent (joined) frequent pattern segments and builds on the idea of multiple supports. First, adjacent short frequent pattern segments are generated. These short segments are then combined to produce new, longer frequent patterns. Experimental analysis shows that JBioPM is more efficient than the BioPM algorithm on sequence databases whose sequences are highly similar. Processing a real protein sequence family database shows that the algorithm can handle biological sequence data effectively.
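The two steps named in the abstract (generate short frequent segments, then combine them into longer frequent patterns) can be illustrated with a minimal Python sketch. This is not the authors' JBioPM implementation: the abstract gives no pseudocode, so everything below is an assumption made for illustration. In particular, the sketch treats a pattern as a contiguous substring, uses a single support threshold rather than multiple supports, and joins a pattern with a segment that overlaps it by `k - 1` characters. The function names `frequent_segments`, `support`, and `mine_long_patterns` are hypothetical.

```python
from collections import defaultdict


def frequent_segments(sequences, k, min_support):
    """Map each contiguous length-k segment to the set of sequence indices
    containing it, keeping only segments found in >= min_support sequences."""
    occurrences = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            occurrences[seq[start:start + k]].add(idx)
    return {seg: ids for seg, ids in occurrences.items() if len(ids) >= min_support}


def support(pattern, sequences):
    """Number of sequences that contain pattern as a contiguous substring."""
    return sum(1 for seq in sequences if pattern in seq)


def mine_long_patterns(sequences, k, min_support):
    """Illustrative simplification (assumed, not from the paper): grow frequent
    patterns by joining them with short frequent segments, then keep only
    patterns that are not substrings of another reported pattern."""
    if k < 2:
        raise ValueError("k must be at least 2 so that segments can overlap")
    segments = frequent_segments(sequences, k, min_support)
    current = set(segments)  # step 1: short frequent segments
    found = set()
    while current:  # step 2: combine segments into longer patterns
        next_level = set()
        for pattern in current:
            extended = False
            for seg in segments:
                # Join when the pattern ends with the first k-1 symbols of seg.
                if pattern.endswith(seg[:-1]):
                    candidate = pattern + seg[-1]
                    if support(candidate, sequences) >= min_support:
                        next_level.add(candidate)
                        extended = True
            if not extended:
                found.add(pattern)
        current = next_level
    maximal = [p for p in found if not any(p != q and p in q for q in found)]
    return sorted(maximal, key=lambda p: (-len(p), p))


# Toy protein-like sequences, made up for this example.
toy = ["MKTAYIAK", "GGMKTAYQ", "MKTAYLLP", "QQQQQQQQ"]
print(mine_long_patterns(toy, k=3, min_support=3))  # ['MKTAY']
```

Because candidates are built only from segments that are already frequent, short infrequent patterns are never enumerated, which is the efficiency argument the abstract makes in general terms.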

Authors: CHANG Lei-ling (常磊玲), ZHU Chun-he (朱春鹤) (Information Engineering College, Shanghai Maritime University, Shanghai 200135, China)

Affiliation: Information Engineering College, Shanghai Maritime University

Source: Computer Knowledge and Technology (《电脑知识与技术》), 2010, No. 7, pp. 5140-5142 (3 pages)

Keywords: joined frequent pattern segment; pattern combination; biological sequence pattern mining; data mining; bioinformatics

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

References (4)

1. Luscombe N M, Greenbaum D, Gerstein M. What Is Bioinformatics? A Proposed Definition and Overview of the Field[J]. Methods of Information in Medicine, 2001, 40(4): 346-358.
2. Xiong Yun, Zhu Yangyong. BioPM: An Efficient Algorithm for Protein Motif Mining[C]//Proc. of ICBBE'07. [S.l.]: IEEE Press, 2007.
3. Wang K, Xu Y, Yu J X. Scalable Sequential Pattern Mining for Biological Sequences[C]//Proceedings of the 13th ACM International Conference on Information and Knowledge Management. New York: ACM Press, 2004: 178-187.
4. Bateman A, Birney E, Cerruti L, et al. The Pfam Protein Families Database[J]. Nucleic Acids Res, 2002, 30(1): 276-288.

