Abstract
Mining interval-valued attribute data sets can effectively reveal the relationships among data. Existing data mining methods do not cluster large-scale data, so the mining process consumes a large amount of memory and yields low accuracy. To address this problem, a new mining algorithm for interval-valued attribute data sets is proposed. The problem definition, data preparation, data extraction, pattern prediction, and data clustering modules are analyzed in detail to complete the clustering of interval-valued attribute data. Based on the clustering results, the interval-valued attribute data are divided into several data sets. The item sets that satisfy the minimum support are selected as frequent item sets, and the association rules between data sets are then extracted. These association rules are integrated into the data calculation step to complete the mining. A simulation was carried out to verify the algorithm. The results show that, compared with traditional mining algorithms, the proposed algorithm uses less memory and achieves higher mining accuracy.
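The frequent item set and association rule steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is a plain level-wise (Apriori-style) search, and the function names, the toy transactions, the item labels `A`, `B`, `C` (standing in for discretized interval values), and the confidence threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    size = 1
    while candidates:
        # Keep only the candidates that reach the minimum support.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: unions of two frequent sets that contain exactly size + 1 items.
        size += 1
        candidates = list(
            {a | b for a, b in combinations(list(level), 2) if len(a | b) == size}
        )
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for rules lhs -> rhs built from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent, so the lookup is safe.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical toy data: each transaction lists the (already discretized) labels it contains.
transactions = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
    frozenset({"A", "B", "C"}),
]

frequent = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.7)
```

With these toy values, the single items and the pairs pass the support threshold while `{A, B, C}` does not. In the setting the abstract describes, the same search would presumably be run on each data set produced by the clustering step rather than on the whole collection at once.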
Author
WANG Xiao-peng (王晓鹏), School of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
Source
Computer Simulation (《计算机仿真》)
Peking University Core Journal (北大核心)
2020, Issue 1, pp. 234-238 (5 pages)
Keywords
Interval-valued attribute data
Data mining
Association rule
Clustering