
Large Data Frequent Item Mining Technology Based on Bayesian Rough Set (cited by: 3)
Abstract: Frequent itemset mining on big data is a key step in association rule mining, and effective frequent item mining improves the access efficiency of large-volume databases. Traditional methods mine frequent itemsets in big data with an FP-Growth-based rough set mining algorithm, which scales poorly and has poor fault tolerance. This paper proposes a frequent item mining technique for big data based on Bayesian rough sets. It introduces the concept of a suffix item table; constructing this table preserves the complete information of the frequent itemsets. The method builds an FP-Tree, generates closed frequent itemsets, computes the sample density, and extracts the point sets in high-density regions as the set of cluster centers. It then constructs the suffix item table, partitions it into several sets by support, fuses the attribute sets within each reduct, and improves the mining algorithm using the Bayesian rough set form of variable precision rough sets. Simulation results show that the algorithm is not affected by the variable parameters, is highly robust, mines with high accuracy, and has a short running time. The author expects the algorithm to find wider application in artificial intelligence and data mining.
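Author: Zhang Benwen (张本文)
Source: Bulletin of Science and Technology (《科技通报》), Peking University Core Journal list, 2015, No. 6, pp. 211-213 (3 pages)
Funding: Natural Science Foundation of the Sichuan Provincial Education Department, No. 13ZA0136
Keywords: Bayesian rough set; frequent item mining; big data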
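
The abstract relies on two standard notions, the support of an itemset and closed frequent itemsets, without defining them. The following is a minimal sketch of both, not the paper's method: it uses plain candidate enumeration instead of an FP-Tree or suffix item table, and it omits the Bayesian rough set step entirely. The function names, the toy transactions, and the `min_support` threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support.

    Brute-force enumeration by itemset size; only suitable for tiny examples.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support count: number of transactions containing the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found = True
        if not found:
            # Support is anti-monotone: if no itemset of this size is
            # frequent, no larger itemset can be frequent either.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


# Hypothetical toy data: four transactions over items a, b, c.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c"},
)]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

for itemset, count in sorted(closed.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this toy data all seven non-empty itemsets are frequent, but only four are closed: {a}, {a, b}, {a, c}, and {a, b, c}. For example, {b} is dropped because {a, b} has the same support of 3. This is why closed itemsets are a compact summary: the support of every frequent itemset can be recovered from them.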

