Abstract
A new classifier ensemble algorithm based on fuzzy clustering is proposed. The algorithm uses fuzzy clustering to obtain the distribution characteristics of the training samples and, based on these, assigns each sample a weight that determines its probability of being sampled. Each classifier trained on a sampled subset is then used to adjust the sampling probabilities of the training set, and new classifiers are generated in turn until a given accuracy is reached. The algorithm was tested on several standard UCI data sets and compared with two classical algorithms, Bagging and AdaBoost. The experimental results show that the new algorithm is more robust and achieves higher classification accuracy.
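The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how weights are derived from the clustering, which base classifier is used, or how probabilities are adjusted, so the initial weight (one minus the largest fuzzy membership), the nearest-centroid base learner, the doubling of probabilities for misclassified samples, and all function and parameter names below are hypothetical choices.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means; returns the (n_samples, c) membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U

def train_nearest_centroid(X, y):
    """Assumed base learner: predict the class whose centroid is closest."""
    classes = np.unique(y)
    centroids = np.stack([X[y == k].mean(axis=0) for k in classes])
    def predict(Z):
        d = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        return classes[d.argmin(axis=1)]
    return predict

def vote(members, X):
    """Majority vote; assumes labels are non-negative integers."""
    preds = np.stack([clf(X) for clf in members])  # (n_members, n_samples)
    return np.array([np.bincount(col).argmax() for col in preds.T])

def build_ensemble(X, y, c=3, n_rounds=10, target_acc=0.95, seed=0):
    rng = np.random.default_rng(seed)
    U = fuzzy_c_means(X, c=c, seed=seed)
    # Assumption: samples with ambiguous cluster membership get larger weight.
    w = 1.0 - U.max(axis=1) + 1e-3
    p = w / w.sum()
    members = []
    for _ in range(n_rounds):
        # Draw a training set according to the current sampling probabilities.
        idx = rng.choice(len(X), size=len(X), replace=True, p=p)
        clf = train_nearest_centroid(X[idx], y[idx])
        members.append(clf)
        # Assumption: double the probability of samples this classifier misses.
        p = p * np.where(clf(X) != y, 2.0, 1.0)
        p /= p.sum()
        # Stop once the ensemble reaches the requested training accuracy.
        if (vote(members, X) == y).mean() >= target_acc:
            break
    return members
```

Calling `build_ensemble(X, y)` on a feature matrix `X` and an integer label vector `y` returns a list of prediction functions, and `vote(members, X_new)` combines them by majority vote. Other weighting rules or base learners could be substituted without changing the overall structure.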
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2008, No. 5, pp. 1204-1207 (4 pages)
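Funding
Shandong Provincial Science and Technology Key Research Program (2005GG4210002)
Science and Technology Program of the Shandong Provincial Education Department (J07YJ04)
Promotive Research Fund for Excellent Young and Middle-aged Scientists of Shandong Province (2006BS01020)
Shandong Provincial High-Tech Independent Innovation Special Program (2007ZZ17)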
Keywords
classifier ensemble
fuzzy clustering
diversity
distribution characteristics of training samples