
Automatic data mining algorithm based on hierarchical frequent pattern tree (基于层次频繁模式树的数据自动挖掘算法) Cited by: 2
Abstract: Large-scale data contain a great deal of redundant information, and existing algorithms do not represent items clearly enough, so the information in the data cannot be mined completely and mining is inefficient. To address this, an automatic data mining algorithm based on a hierarchical frequent pattern tree is proposed. The mining task is defined on the hierarchical frequent pattern tree, an automatic data connection matrix is built using candidate-set pruning, and the data are mined automatically by pruning the queue with a minimum support threshold, which completes the design of the algorithm. In the experiments, operation and maintenance data from electric multiple units (EMUs) served as test samples, and data sets of different sizes were mined. The results show that the proposed algorithm fully expresses the data within the specified time. For a data set of 2 million records, it mined 100,000 and 80,000 more records, respectively, than the traditional algorithms it was compared with, improving working efficiency.
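The abstract names minimum-support pruning and a frequent pattern tree but gives no construction details, so the following is only a minimal sketch of a plain (non-hierarchical) FP-tree build with minimum-support pruning. The names `FPNode` and `build_fp_tree`, the absolute-count threshold, and the sample records are assumptions made for illustration; the hierarchical extension and the connection matrix described in the paper are not reproduced here.

```python
from collections import Counter


class FPNode:
    """One node of a frequent pattern tree: an item, a count, and children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a plain FP-tree, pruning items seen in fewer than min_support transactions."""
    # First pass: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_support}

    root = FPNode()
    # Second pass: drop infrequent items, order the rest by descending
    # support (ties broken by item name), and insert the path into the tree.
    for t in transactions:
        kept = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-frequent[item], item),
        )
        node = root
        for item in kept:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


# Hypothetical maintenance records; each list is one transaction.
records = [
    ["brake", "door", "axle"],
    ["brake", "door"],
    ["brake", "pantograph"],
    ["door", "axle"],
]
tree, frequent = build_fp_tree(records, min_support=2)
print(sorted(frequent.items()))  # → [('axle', 2), ('brake', 3), ('door', 3)]
```

Sorting each transaction by descending support before insertion is what lets transactions share prefixes, which keeps the tree compact; the item that falls below the threshold (`pantograph` in this made-up data) never enters the tree.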
Authors: WANG Jinglan (王景兰); FANG Xiao (方晓) (Department of Information Engineering, Bozhou Vocational and Technical College, Bozhou 236800, Anhui, China)
Source: Journal of Shanghai Dianji University (《上海电机学院学报》), 2022, No. 4, pp. 239-242, 248 (5 pages in total)
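Funding: Anhui Province Vocational Education Innovation and Development Pilot Zone funded project (WJ-ZYPX-003); Anhui Provincial Quality Engineering funded project (2020jxtd173); 2020 Anhui Provincial University Humanities Research funded project (SK2020A0778); 2020 Bozhou Vocational and Technical College Humanities Research funded project (BYK2029).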
Keywords: hierarchical frequent pattern tree; automatic data mining; relevant rules; data source; connection matrix
