Abstract
This paper explores effective methods for explaining and analyzing outlier data sets obtained by mining. Building on attribute reduction in rough set theory, it defines concepts such as the degree of outlying contribution of an attribute to describe the outlying characteristics of high-dimensional data sets quantitatively. It proposes the ideas of outlying partition and outlying reduction, together with a method for analyzing the key attribute subspace of outliers, and gives an outlying reduction algorithm along with an analysis of its complexity. Experiments show that the approach can effectively reveal the source of outliers and supports a fuller understanding of the whole data set, and that the proposed algorithm scales well with problem size.
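The abstract does not give the paper's definition of the degree of outlying contribution, so the following is only a minimal sketch of the classical rough-set notions it builds on: the dependency degree of a decision attribute on a set of condition attributes, and the significance of a single attribute (the drop in dependency when that attribute is removed). The toy table, the `outlier` label column, and the function names `partition`, `dependency`, and `significance` are all assumptions made for illustration. They are not the paper's algorithm.

```python
from collections import defaultdict
from fractions import Fraction


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def dependency(rows, cond, dec):
    """Classical dependency degree: size of the positive region over |U|."""
    dec_blocks = partition(rows, dec)
    pos = 0
    for block in partition(rows, cond):
        # A condition class is in the positive region if it lies
        # entirely inside one decision class.
        if any(block <= d for d in dec_blocks):
            pos += len(block)
    return Fraction(pos, len(rows))


def significance(rows, cond, dec, attr):
    """Drop in dependency degree when attr is removed from cond."""
    rest = [c for c in cond if c != attr]
    return dependency(rows, cond, dec) - dependency(rows, rest, dec)


# Hypothetical toy table: 'outlier' is assumed to come from an earlier
# outlier-mining step; a, b, c are discretized condition attributes.
rows = [
    {"a": 0, "b": 0, "c": 0, "outlier": 0},
    {"a": 0, "b": 1, "c": 0, "outlier": 0},
    {"a": 1, "b": 0, "c": 0, "outlier": 0},
    {"a": 0, "b": 0, "c": 1, "outlier": 1},
    {"a": 1, "b": 1, "c": 0, "outlier": 0},
]
cond, dec = ["a", "b", "c"], ["outlier"]

for attr in cond:
    print(attr, significance(rows, cond, dec, attr))
```

In this toy table only attribute `c` has nonzero significance, which is the kind of signal an attribute-level explanation of outliers would look for. How the paper actually weights attributes may differ.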
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal
2006, No. 9, pp. 147-149 (3 pages)
Funding
National Natural Science Foundation of China (Grant No. 60403009)
Natural Science Foundation of Chongqing (Grant No. 2005BB2224)
Keywords
outlying partition
key attribute subspace
degree of outlying contribution
outlying reduction