Abstract
In weighted sequential pattern mining, MWSP, which follows the candidate generation-and-test approach, is among the most practical algorithms currently available. However, this approach is prone to a combinatorial explosion of candidates during mining. This paper therefore proposes PWSM, an efficient algorithm for mining weighted sequential patterns. PWSM introduces the concept of the k-minimum weighted support and uses prefix-projected databases to avoid the candidate combinatorial explosion. It also makes full use of the minimum weighted support during mining to optimize the algorithm further. Experimental results show that PWSM mines weighted sequential patterns from sequence databases more effectively than MWSP.
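The abstract does not give PWSM's definitions, so the following is only a minimal sketch of the general idea of prefix projection with weighted-support pruning, not the paper's algorithm. It rests on several assumptions that are not from the source: each sequence element is a single item, the weight of a pattern is the mean of its item weights, the weighted support is the support count multiplied by that weight, and pruning uses the largest item weight as an upper bound. The k-minimum weighted support concept is not modelled, and the names `mine_weighted_patterns`, `db`, `weights`, and `min_wsup` are hypothetical.

```python
from collections import defaultdict


def mine_weighted_patterns(db, weights, min_wsup):
    """Prefix-projection sketch (assumed definitions, not the paper's PWSM).

    db       : list of sequences, each a list of single items
    weights  : dict mapping item -> weight (assumed to be given)
    min_wsup : minimum weighted support threshold
    Returns a dict mapping pattern (tuple of items) -> weighted support.
    """
    max_w = max(weights.values())  # upper bound on any pattern's mean weight
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start position) pairs, i.e. the
        # suffixes that remain after the first match of the current prefix.
        counts = defaultdict(int)
        for sid, start in projected:
            for item in set(db[sid][start:]):
                counts[item] += 1  # count each sequence at most once

        for item, sup in sorted(counts.items()):
            # Assumed pruning rule: no extension of this pattern can exceed
            # sup * max_w, so the whole branch is skipped below the threshold.
            if sup * max_w < min_wsup:
                continue
            pattern = prefix + (item,)
            mean_w = sum(weights[i] for i in pattern) / len(pattern)
            wsup = sup * mean_w
            if wsup >= min_wsup:
                results[pattern] = wsup

            # Build the projected database for the extended prefix instead of
            # generating and testing candidate sequences.
            new_projected = []
            for sid, start in projected:
                seq = db[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            grow(pattern, new_projected)

    grow((), [(sid, 0) for sid in range(len(db))])
    return results
```

Because each recursive call only scans the suffixes that follow the current prefix, no candidate set is enumerated up front, which is the property the abstract attributes to the prefix-projection principle.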
Source
Computer & Digital Engineering (《计算机与数字工程》)
2010, No. 11, pp. 4-9 (6 pages)
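Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 61070047, 61070133, 61003180), the Natural Science Foundation of Jiangsu Province (Grant Nos. BK2008206, BK21010311), and the Natural Science Foundation of the Jiangsu Provincial Department of Education (Grant Nos. 08KJB520012, 09KJB20013).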
Keywords
data mining
weighted sequential pattern
weighted support