Abstract
Traditional data mining algorithms for heterogeneous networks cluster data on the basis of the correlation between data items. When a data set contains a large amount of redundant data, this correlation weakens and mining accuracy falls. To address this problem, a new mining algorithm for weakly associated, redundant environments is proposed. The algorithm first applies a data clustering method to determine the initial cluster centers of the big data set, then repeatedly updates those centers so that they approach the true centers, thereby clustering the data set. It then mines the weak association rules of the big data set, computes the association between data under those rules, and uses a weakened association rule method to mine the data in the weakly associated redundant environment. Experimental results show that the proposed algorithm achieves relatively high mining efficiency and accuracy, with relatively low complexity.
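The abstract does not give the update rule used to move the cluster centers toward the true centers. As a rough illustration only, the sketch below uses a standard Lloyd-style (k-means) refinement as a stand-in: each point is assigned to its nearest center, and each center is moved to the mean of its assigned points until the centers stop moving. The function name `refine_centers`, the parameters `max_iter` and `tol`, and the sample data are all assumptions made for this example and do not come from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def refine_centers(points, centers, max_iter=100, tol=1e-12):
    """Lloyd-style refinement (assumed stand-in, not the paper's exact rule).

    Repeats assignment and mean-update steps until no center moves by more
    than `tol` (squared distance) or `max_iter` passes have been made.
    Returns the refined centers and the final label of each point.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        updated = [mean_point(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
        shift = max(sq_dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift <= tol:
            break
    labels = [nearest(p, centers) for p in points]
    return centers, labels


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of four points.
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    centers, labels = refine_centers(pts, [(0.0, 0.0), (10.0, 10.0)])
    print(centers, labels)
```

The weak association rule mining stage is not sketched here, because the abstract does not define how association strength is computed under the weakened rules.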
Authors
盛昀瑶
沈阳
Yun-yao SHENG; Yang SHEN (School of Information Engineering, Changzhou Vocational Institute of Mechatronic Technology, Changzhou 213164, China; Institute of Computer Systems, South China University of Technology, Guangzhou 510006, China)
Source
《机床与液压》
PKU Core Journal
2018, Issue 18, pp. 186-192 (7 pages)
Machine Tool & Hydraulics
Keywords
Weak association
Redundancy
Data mining
Algorithm
Clustering
Association rule