
A Data Mining Algorithm for Matrix and Sort Index Association Rules (cited by: 9)
Abstract: Among association rule mining algorithms, Apriori scans the database repeatedly and generates a large number of candidate sets, which incurs heavy I/O overhead and lowers mining efficiency. Matrix-based association rule algorithms do not delete infrequent itemsets during mining, so many scans are wasted and the gain in mining efficiency is limited. This paper proposes an improved association rule mining algorithm based on a matrix and a sorted index. It first deletes unneeded transactions and items, then obtains the frequent 2-itemsets through matrix multiplication and a lookup table, and finally derives the remaining frequent k-itemsets with the help of the sorted index. Compared with the matrix association rule algorithm and the Apriori algorithm, the proposed algorithm can directly find frequent itemsets and scan the database, and it shows good feasibility and execution efficiency when many frequent itemsets are generated or when the database must be updated dynamically. Experiments show that the proposed matrix and sorted index algorithm reduces memory usage and I/O overhead, improves mining efficiency, and scales well.
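The pruning and matrix-multiplication steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `frequent_pairs`, the absolute `min_support` count, and the toy transactions are all assumptions made for illustration, and the lookup table and sorted index used for the frequent k-itemsets are not reproduced because the abstract does not describe them in detail.

```python
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return (frequent items, frequent 2-itemsets) with their support counts.

    Hypothetical helper: min_support is an absolute transaction count.
    """
    # One pass to count how many transactions contain each item.
    item_support = {}
    for t in transactions:
        for item in set(t):
            item_support[item] = item_support.get(item, 0) + 1

    # Prune infrequent items; sorting fixes the column order of the matrix.
    items = sorted(i for i, s in item_support.items() if s >= min_support)

    # Build the 0/1 transaction matrix over the surviving items and drop
    # transactions that can no longer contribute to any 2-itemset.
    rows = []
    for t in transactions:
        present = set(t)
        row = [1 if i in present else 0 for i in items]
        if sum(row) >= 2:
            rows.append(row)

    # Entry (a, b) of M^T M counts the transactions containing both items,
    # so each off-diagonal entry is the support of one 2-itemset.
    pairs = {}
    for a, b in combinations(range(len(items)), 2):
        support = sum(row[a] * row[b] for row in rows)
        if support >= min_support:
            pairs[(items[a], items[b])] = support

    return {i: item_support[i] for i in items}, pairs


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a", "b", "c"]]
    singles, pairs = frequent_pairs(data, min_support=3)
    print(singles)
    print(pairs)
```

In this sketch the database is read once to count items and once to build the matrix. All 2-itemset supports then come from in-memory products of matrix columns, which is the general idea behind replacing repeated Apriori scans with matrix operations.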
Authors: LIU Yan-rong; YANG Yun (School of Information and Engineering, Shaanxi Institute of International Trade & Commerce, Xi'an 712000, China; School of Electronic Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an 710021, China)
Source: Computer Technology and Development, 2021, No. 2, pp. 54-59 (6 pages)
Funding: Shaanxi Provincial Key Research and Development Program (2019NY-185); Natural Science Foundation of Shaanxi Province (2017JM6111).
Keywords: data mining; association rules; Apriori algorithm; matrix algorithm; sorting index; sequence marker