Abstract
Clustering is one of the most active research areas in data mining and is widely applied in other scientific fields. This paper designs an outlier data mining algorithm based on weighted fast clustering so that outlier data can be detected quickly. First, each attribute of the data is assigned a weight whose magnitude reflects that attribute's contribution to classification. Based on the characteristics of the attribute weights, a good initial partition is selected, and repeated iteration then yields a near-optimal partition. Next, certain rules are applied to identify the outlier data classes. Finally, practical application shows that the technique achieves good social benefits.
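The abstract names the steps (attribute weighting, choice of an initial partition, iteration, and a rule for picking out outlier classes) but does not give their details. The following is only a minimal sketch of that general pipeline under stated assumptions, not the paper's algorithm: the weights are supplied by hand, the initial partition uses farthest-point seeding, the iteration is a plain k-means loop under a weighted Euclidean distance, and the outlier rule flags clusters holding less than a hypothetical `min_fraction` of the data. All function and parameter names are illustrative.

```python
import math

def weighted_dist(a, b, weights):
    """Weighted Euclidean distance: a larger weight makes an attribute count more."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def init_centers(points, weights, k):
    """Assumed initialization: farthest-point seeding under the weighted distance."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        nxt = max(points,
                  key=lambda p: min(weighted_dist(p, c, weights) for c in centers))
        centers.append(nxt)
    return centers

def weighted_kmeans(points, weights, k, max_iter=100):
    """Iterate assignment and center updates until the labels stop changing."""
    centers = init_centers(points, weights, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: weighted_dist(p, centers[j], weights))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

def outlier_clusters(labels, k, min_fraction=0.1):
    """Assumed rule: a cluster with very few members is treated as an outlier class."""
    n = len(labels)
    return [j for j in range(k) if labels.count(j) < min_fraction * n]

# Toy data: two dense groups and one isolated record.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
          (50, 50)]
weights = (1.0, 1.0)  # hand-picked for illustration

labels, centers = weighted_kmeans(points, weights, k=3)
flagged = outlier_clusters(labels, k=3)
outliers = [p for p, lab in zip(points, labels) if lab in flagged]
print(outliers)
```

On this toy data the isolated record ends up alone in its own cluster and is the only point flagged. In a real setting the weights would come from some measure of each attribute's contribution to classification, which the abstract does not specify.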
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2007, No. 35, pp. 153-155 (3 pages)
Computer Engineering and Applications
Funding
National Torch Program of China (No. 2004EB33006)
Guiding Program for Natural Science Research of Jiangsu Provincial Universities (No. 05JKD520050)