Abstract
Hierarchical clustering is an important research topic in pattern recognition and data mining, with broad application prospects. Inspired by the way decision tree learning selects the best classification attribute, this paper proposes a hierarchical clustering method that incorporates information gain. The method uses information gain to guide attribute weighting during hierarchical clustering, thereby improving the quality of the clustering results. Experimental results on UCI machine learning data sets indicate that the algorithm clearly outperforms the original hierarchical clustering algorithm and yields better stability.
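The abstract does not state the weighting formula, so the following is only a minimal sketch of one possible reading, not the paper's algorithm. It assumes categorical attributes, assumes that reference labels (or provisional cluster labels) are available for computing information gain, normalizes the gains into weights, and plugs them into a weighted Hamming distance for average-linkage agglomerative clustering. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Entropy reduction in `labels` after splitting on one categorical column."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def attribute_weights(rows, labels):
    """Assumed weighting scheme: information gains normalized to sum to 1."""
    gains = [information_gain(col, labels) for col in zip(*rows)]
    total = sum(gains)
    if total == 0:  # no attribute is informative: fall back to equal weights
        return [1 / len(gains)] * len(gains)
    return [g / total for g in gains]


def weighted_distance(a, b, weights):
    """Weighted Hamming distance between two categorical records."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def agglomerate(rows, weights, k):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        total = sum(weighted_distance(rows[i], rows[j], weights)
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters.pop(q)  # q > p, so index p stays valid
        clusters[p].extend(merged)
    return [sorted(c) for c in clusters]


# Hypothetical toy data: attribute 0 tracks the labels, attribute 1 is noise.
rows = [("a", "x"), ("a", "y"), ("a", "x"), ("b", "y"), ("b", "x"), ("b", "y")]
labels = [0, 0, 0, 1, 1, 1]
weights = attribute_weights(rows, labels)
print(weights)
print(agglomerate(rows, weights, k=2))
```

On this toy data the first attribute receives most of the weight, so records that agree on it are merged first. With real data the labels used for the gain computation would have to come from somewhere (for example a labeled subset or an initial clustering), which is an assumption the abstract neither confirms nor rules out.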
Source
《计算机工程与应用》
CSCD
2012, No. 1, pp. 142-144 (3 pages)
Computer Engineering and Applications
Funding
Shandong Provincial Science and Technology Research Program (Nos. ZR2010FM021, 2010G0020115, 2008B0026)
Research Project of the Shandong Provincial Department of Education (No. J09LG02)
Keywords
hierarchical clustering
information gain
attribute weighting