Abstract
High utility itemset (HUI) mining yields interesting itemsets but does not convey the quantity of each individual item; high utility fuzzy itemsets are therefore proposed. Real-world data, however, arrives continuously, so newly arriving data must be processed in real time. To address the inability of existing high utility fuzzy itemset methods to handle data streams, a fuzzy utility list (FUL) structure is proposed. For the items in the current window, FUL stores the batch number, the transaction identifier of the transaction containing the item, the fuzzy utility of the item, and the remaining fuzzy utility of the item, and it supports efficient insertion and deletion of batches. Finally, a high utility fuzzy itemset mining algorithm for data streams is proposed based on FUL. Extensive experiments on real and synthetic datasets confirm the efficiency and feasibility of the algorithm.
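The abstract names four fields stored by FUL and says batches can be inserted and deleted as the window moves. The Python sketch below illustrates one way such a structure could look. It is not the authors' implementation: the class and field names (`FulEntry`, `fu`, `rfu`), the item label `"A.low"`, the numeric values, and the `slide` helper are all hypothetical, and the assumption that each batch's entries are stored contiguously in arrival order is ours, not the paper's.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FulEntry:
    """One row of a fuzzy utility list (field names are assumptions)."""
    batch: int   # batch number within the current window
    tid: int     # transaction identifier
    fu: float    # fuzzy utility of the item in this transaction
    rfu: float   # remaining fuzzy utility in this transaction


@dataclass
class FuzzyUtilityList:
    """Entries for one fuzzy item, kept in batch arrival order."""
    item: str
    entries: deque = field(default_factory=deque)

    def insert_batch(self, batch, rows):
        """Append the (tid, fu, rfu) rows of a newly arrived batch."""
        for tid, fu, rfu in rows:
            self.entries.append(FulEntry(batch, tid, fu, rfu))

    def delete_batch(self, batch):
        """Drop the oldest batch; its entries sit at the left end."""
        while self.entries and self.entries[0].batch == batch:
            self.entries.popleft()

    def sum_fu(self):
        """Total fuzzy utility over the current window."""
        return sum(e.fu for e in self.entries)

    def sum_rfu(self):
        """Total remaining fuzzy utility over the current window."""
        return sum(e.rfu for e in self.entries)


def slide(ful, window, new_batch, rows, window_size):
    """Advance the window by one batch: evict the oldest if full, then insert."""
    if len(window) == window_size:
        ful.delete_batch(window.popleft())
    window.append(new_batch)
    ful.insert_batch(new_batch, rows)


# Illustrative data only: a window of two batches receiving three batches.
ful = FuzzyUtilityList("A.low")
window = deque()
slide(ful, window, 1, [(1, 2.0, 3.0), (2, 1.5, 0.5)], window_size=2)
slide(ful, window, 2, [(3, 4.0, 1.0)], window_size=2)
slide(ful, window, 3, [(5, 1.0, 2.0)], window_size=2)
print(list(window), [e.tid for e in ful.entries], ful.sum_fu(), ful.sum_rfu())
```

Keeping entries in arrival order means the oldest batch can be removed from one end without scanning the list, which is one plausible reading of why tagging each entry with a batch number makes window maintenance cheap.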
Authors
Shan Zhihui (单芝慧)
Han Meng (韩萌)
Han Qiang (韩强)
School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China; The Key Laboratory of Images & Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
Source
Journal of Nanjing Normal University (Natural Science Edition) (《南京师大学报(自然科学版)》)
CAS
PKU Core (北大核心)
2023, No. 1, pp. 120-129 (10 pages)
Funding
National Natural Science Foundation of China (62062004, 61862001)
Natural Science Foundation of Ningxia (2020AAC03216)
Keywords
data stream mining
sliding window
high utility itemset mining
fuzzy utility
utility list