期刊文献+
共找到6篇文章
< 1 >
每页显示 20 50 100
Frequent Itemset Mining of User’s Multi-Attribute under Local Differential Privacy 被引量:2
1
作者 Haijiang Liu Lianwei Cui +1 位作者 Xuebin Ma Celimuge Wu 《Computers, Materials & Continua》 SCIE EI 2020年第10期369-385,共17页
Frequent itemset mining is an essential problem in data mining and plays a key role in many data mining applications.However,users’personal privacy will be leaked in the mining process.In recent years,application of ... Frequent itemset mining is an essential problem in data mining and plays a key role in many data mining applications.However,users’personal privacy will be leaked in the mining process.In recent years,application of local differential privacy protection models to mine frequent itemsets is a relatively reliable and secure protection method.Local differential privacy means that users first perturb the original data and then send these data to the aggregator,preventing the aggregator from revealing the user’s private information.We propose a novel framework that implements frequent itemset mining under local differential privacy and is applicable to user’s multi-attribute.The main technique has bitmap encoding for converting the user’s original data into a binary string.It also includes how to choose the best perturbation algorithm for varying user attributes,and uses the frequent pattern tree(FP-tree)algorithm to mine frequent itemsets.Finally,we incorporate the threshold random response(TRR)algorithm in the framework and compare it with the existing algorithms,and demonstrate that the TRR algorithm has higher accuracy for mining frequent itemsets. 展开更多
关键词 Local differential privacy frequent itemset mining user’s multi-attribute
下载PDF
FPGA-Based Stream Processing for Frequent Itemset Mining with Incremental Multiple Hashes
2
作者 Kasho Yamamoto Masayuki Ikebe +1 位作者 Tetsuya Asai Masato Motomura 《Circuits and Systems》 2016年第10期3299-3309,共11页
With the advent of the IoT era, the amount of real-time data that is processed in data centers has increased explosively. As a result, stream mining, extracting useful knowledge from a huge amount of data in real time... With the advent of the IoT era, the amount of real-time data that is processed in data centers has increased explosively. As a result, stream mining, extracting useful knowledge from a huge amount of data in real time, is attracting more and more attention. It is said, however, that real- time stream processing will become more difficult in the near future, because the performance of processing applications continues to increase at a rate of 10% - 15% each year, while the amount of data to be processed is increasing exponentially. In this study, we focused on identifying a promising stream mining algorithm, specifically a Frequent Itemset Mining (FIsM) algorithm, then we improved its performance using an FPGA. FIsM algorithms are important and are basic data- mining techniques used to discover association rules from transactional databases. We improved on an approximate FIsM algorithm proposed recently so that it would fit onto hardware architecture efficiently. We then ran experiments on an FPGA. As a result, we have been able to achieve a speed 400% faster than the original algorithm implemented on a CPU. Moreover, our FPGA prototype showed a 20 times speed improvement compared to the CPU version. 展开更多
关键词 Data mining Frequent itemset mining FPGA Stream Processing
下载PDF
PHUI-GA: GPU-based efficiency evolutionary algorithm for mining high utility itemsets
3
作者 JIANG Haipeng WU Guoqing +3 位作者 SUN Mengdan LI Feng SUN Yunfei FANG Wei 《Journal of Systems Engineering and Electronics》 SCIE CSCD 2024年第4期965-975,共11页
Evolutionary algorithms(EAs)have been used in high utility itemset mining(HUIM)to address the problem of discover-ing high utility itemsets(HUIs)in the exponential search space.EAs have good running and mining perform... Evolutionary algorithms(EAs)have been used in high utility itemset mining(HUIM)to address the problem of discover-ing high utility itemsets(HUIs)in the exponential search space.EAs have good running and mining performance,but they still require huge computational resource and may miss many HUIs.Due to the good combination of EA and graphics processing unit(GPU),we propose a parallel genetic algorithm(GA)based on the platform of GPU for mining HUIM(PHUI-GA).The evolution steps with improvements are performed in central processing unit(CPU)and the CPU intensive steps are sent to GPU to eva-luate with multi-threaded processors.Experiments show that the mining performance of PHUI-GA outperforms the existing EAs.When mining 90%HUIs,the PHUI-GA is up to 188 times better than the existing EAs and up to 36 times better than the CPU parallel approach. 展开更多
关键词 high utility itemset mining(HUIM) graphics process-ing unit(GPU)parallel genetic algorithm(GA) mining perfor-mance
下载PDF
Hadamard Encoding Based Frequent Itemset Mining under Local Differential Privacy 被引量:1
4
作者 赵丹 赵素云 +3 位作者 陈红 刘睿瑄 李翠平 张晓莹 《Journal of Computer Science & Technology》 SCIE EI CSCD 2023年第6期1403-1422,共20页
Local differential privacy(LDP)approaches to collecting sensitive information for frequent itemset mining(FIM)can reliably guarantee privacy.Most current approaches to FIM under LDP add"padding and sampling"... Local differential privacy(LDP)approaches to collecting sensitive information for frequent itemset mining(FIM)can reliably guarantee privacy.Most current approaches to FIM under LDP add"padding and sampling"steps to obtain frequent itemsets and their frequencies because each user transaction represents a set of items.The current state-of-the-art approach,namely set-value itemset mining(SVSM),must balance variance and bias to achieve accurate results.Thus,an unbiased FIM approach with lower variance is highly promising.To narrow this gap,we propose an Item-Level LDP frequency oracle approach,named the Integrated-with-Hadamard-Transform-Based Frequency Oracle(IHFO).For the first time,Hadamard encoding is introduced to a set of values to encode all items into a fixed vector,and perturbation can be subsequently applied to the vector.An FIM approach,called optimized united itemset mining(O-UISM),is pro-posed to combine the padding-and-sampling-based frequency oracle(PSFO)and the IHFO into a framework for acquiring accurate frequent itemsets with their frequencies.Finally,we theoretically and experimentally demonstrate that O-UISM significantly outperforms the extant approaches in finding frequent itemsets and estimating their frequencies under the same privacy guarantee. 展开更多
关键词 local differential privacy frequent itemset mining frequency oracle
原文传递
A Parallel High-Utility Itemset Mining Algorithm Based on Hadoop 被引量:1
5
作者 Zaihe Cheng Wei Shen +1 位作者 Wei Fang Jerry Chun-Wei Lin 《Complex System Modeling and Simulation》 2023年第1期47-58,共12页
High-utility itemset mining(HUIM)can consider not only the profit factor but also the profitable factor,which is an essential task in data mining.However,most HUIM algorithms are mainly developed on a single machine,w... High-utility itemset mining(HUIM)can consider not only the profit factor but also the profitable factor,which is an essential task in data mining.However,most HUIM algorithms are mainly developed on a single machine,which is inefficient for big data since limited memory and processing capacities are available.A parallel efficient high-utility itemset mining(P-EFIM)algorithm is proposed based on the Hadoop platform to solve this problem in this paper.In P-EFIM,the transaction-weighted utilization values are calculated and ordered for the itemsets with the MapReduce framework.Then the ordered itemsets are renumbered,and the low-utility itemsets are pruned to improve the dataset utility.In the Map phase,the P-EFIM algorithm divides the task into multiple independent subtasks.It uses the proposed S-style distribution strategy to distribute the subtasks evenly across all nodes to ensure load-balancing.Furthermore,the P-EFIM uses the EFIM algorithm to mine each subtask dataset to enhance the performance in the Reduce phase.Experiments are performed on eight datasets,and the results show that the runtime performance of P-EFIM is significantly higher than that of the PHUI-Growth,which is also HUIM algorithm based on the Hadoop framework. 展开更多
关键词 pattern mining data mining HADOOP PARALLEL high-utility itemset mining big data
原文传递
Parallel Incremental Frequent Itemset Mining for Large Data 被引量:5
6
作者 Yu-Geng Song Hui-Min Cui Xiao-Bing Feng 《Journal of Computer Science & Technology》 SCIE EI CSCD 2017年第2期368-385,共18页
Frequent itemset mining (FIM) is a popular data mining issue adopted in many fields, such as commodity recommendation in the retail industry, log analysis in web searching, and query recommendation (or related sea... Frequent itemset mining (FIM) is a popular data mining issue adopted in many fields, such as commodity recommendation in the retail industry, log analysis in web searching, and query recommendation (or related search). A large number of FIM algorithms have been proposed to obtain better performance, including parallelized algorithms for processing large data volumes. Besides, incremental FIM algorithms are also proposed to deal with incremental database updates. However, most of these incremental algorithms have low parallelism, causing low efficiency on huge databases. This paper presents two parallel incremental FIM algorithms called IncMiningPFP and IncBuildingPFP, implemented on the MapReduce framework. IncMiningPFP preserves the FP-tree mining results of the original pass, and utilizes them for incremental calculations. In particular, we propose a method to generate a partial FP-tree in the incremental pass, in order to avoid unnecessary mining work. Further, some of the incremental parallel tasks can be omitted when the inserted transactions include fewer items. IncbuildingPFP preserves the CanTrees built in the original pass, and then adds new transactions to them during the incremental passes. Our experimental results show that IncMiningPFP can achieve significant speedup over PFP (Parallel FPGrowth) and a sequential incremental algorithm (CanTree) in most cases of incremental input database, and in other cases IncBuildingPFP can achieve it. 展开更多
关键词 incremental parallel FPGrowth data mining frequent itemset mining MAPREDUCE
原文传递
上一页 1 下一页 到第
使用帮助 返回顶部