Uncertain Data Preconditioning Method in Frequent Itemset Mining (不确定性数据上频繁项集挖掘的预处理方法)

Cited by: 10
Abstract: Traditional frequent itemset mining techniques cannot efficiently extract valuable information from uncertain data. Building on the principles of the frequent pattern growth tree (FP-growth) algorithm and on the characteristics of uncertain data, the authors propose PCAFP-Growth, a preprocessing method for uncertain data. The method uses principal component analysis to reduce the dimensionality of the data and fuzzy association analysis to classify the data probabilities, which makes pruning possible. The theoretical work is validated experimentally on real-world datasets. The results show that the pruning strategy based on principal component analysis speeds up computation and reduces memory use on dense datasets.
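Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journal list (北大核心), 2012, No. 7, pp. 161-164, 199 (5 pages)
Funding: National Natural Science Foundation of China (61100112); Research and Innovation Team Support Program of the Central University of Finance and Economics
Keywords: uncertain data; frequent itemset; principal component analysis; fuzzy association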
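
The abstract describes pruning uncertain data before frequent itemset mining but gives no details of PCAFP-Growth itself. The sketch below is therefore not the authors' method. It only illustrates the expected-support model commonly used for uncertain transactions, with a plain probability threshold standing in for the fuzzy classification step. The toy data, the function names (`prune_low_probability`, `expected_support`, `frequent_itemsets`), the thresholds, and the assumption that items are independent are all hypothetical.

```python
from itertools import combinations
from math import prod

# Hypothetical toy data: each transaction maps an item to the
# probability that the item is actually present.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": 0.7, "c": 0.6},
    {"a": 0.5, "b": 0.4, "c": 0.05},
]


def prune_low_probability(transactions, min_prob):
    """Drop item occurrences whose existence probability is below min_prob.

    Assumption: a simple cutoff used in place of the fuzzy classification
    mentioned in the abstract.
    """
    return [{i: p for i, p in t.items() if p >= min_prob} for t in transactions]


def expected_support(itemset, transactions):
    """Sum, over transactions containing every item, of the product of the
    item probabilities (items are assumed independent)."""
    return sum(
        prod(t[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def frequent_itemsets(transactions, min_esup, max_size=2):
    """Brute-force enumeration of itemsets up to max_size whose expected
    support reaches min_esup. Fine for toy data only."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            esup = expected_support(itemset, transactions)
            if esup >= min_esup:
                result[itemset] = esup
    return result


pruned = prune_low_probability(transactions, min_prob=0.3)
frequent = frequent_itemsets(pruned, min_esup=0.9)
```

Pruning shrinks each transaction before any itemset is counted, which is the general reason a preprocessing step can save time and memory on dense data. A tree-based miner such as FP-growth would replace the brute-force enumeration in practice.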

