Abstract
Existing research on frequent pattern mining for distributed multidimensional data streams does not delete infrequent itemsets from the streams, which leads to long average processing times. To address this, a distributed multidimensional data stream frequent pattern mining algorithm based on an artificial neural network is proposed. Drawing on the characteristics of artificial neural networks, the method builds an artificial neural network model and trains it on the multidimensional data streams to improve mining efficiency. Based on the training results, it constructs a frequent pattern information tree for the data stream, the frequent pattern tree (FR-tree: Frequent Pattern tree). Because the FR-tree contains many expired multidimensional data stream entries, the tree is pruned and infrequent itemsets are deleted, which speeds up the computation of frequent patterns. A distributed mining algorithm then mines the global FR-tree to obtain the complete set of frequent itemsets of the multidimensional data streams, thereby realizing frequent pattern mining of distributed multidimensional data streams. Tests of the method's average processing time verify its practicality.
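The pruning and distributed merging steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it replaces the FR-tree with plain counters and omits the neural network stage entirely. The function names, the sliding-window notion of "expired", and the parameters `window`, `max_size`, and `min_support` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def prune_expired(transactions, now, window):
    """Keep only (timestamp, items) pairs still inside the sliding window.

    Assumption: a transaction is "expired" once it is `window` or more
    time units older than `now`.
    """
    return [(t, items) for t, items in transactions if now - t < window]


def local_counts(transactions, max_size=2):
    """Count every itemset of up to `max_size` items seen on one node."""
    counts = Counter()
    for _, items in transactions:
        unique = sorted(set(items))  # canonical order, no duplicate items
        for k in range(1, max_size + 1):
            counts.update(combinations(unique, k))
    return counts


def global_frequent(node_counts, min_support):
    """Merge per-node counts and drop itemsets below `min_support`."""
    total = Counter()
    for counts in node_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


if __name__ == "__main__":
    # Two hypothetical nodes, each holding (timestamp, items) records.
    node_a = [(1, ["a", "b"]), (8, ["a", "b", "c"]), (9, ["a", "c"])]
    node_b = [(7, ["a", "b"]), (9, ["b", "c"])]
    per_node = [
        local_counts(prune_expired(node, now=10, window=5))
        for node in (node_a, node_b)
    ]
    print(global_frequent(per_node, min_support=3))
```

Because the global support of an itemset is the sum of its local supports, merging the per-node counters before applying the threshold gives the same frequent itemsets as counting over the combined stream. A tree-based structure would store the same counts more compactly.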
Author
SHI Yifei (School of Intelligence Technology, Geely University of China, Chengdu 641423, China)
Source
Journal of Jilin University (Information Science Edition)
CAS
2023, No. 1, pp. 174-179 (6 pages)
Funding
Supported by the Beijing Higher Education Undergraduate Teaching Reform and Innovation Fund Project (2022xzky001).
Keywords
artificial neural network
distributed multidimensional data stream
frequent patterns
mining algorithm
frequent pattern tree (FR-tree) algorithm