Abstract
This paper defines page interest and proposes an improved algorithm to address a known weakness of the traditional Apriori association rule algorithm: it must scan the database repeatedly to generate candidate itemsets. After preprocessing, the improved algorithm partitions the transaction database into segments, so that comparisons no longer have to be made against every transaction record, which reduces comparison time. Finally, page interest is applied to the improved Apriori algorithm, yielding an association rule algorithm based on page interest, the I_NEW_AR algorithm. Experimental results show that the algorithm not only improves mining efficiency but also achieves good accuracy when applied to an online recommendation system.
Source
Journal of Zhejiang Sci-Tech University (Natural Sciences)
2009, No. 6, pp. 886-890 (5 pages)