Abstract
Although support vector clustering (SVC) has many applications in data mining, fault diagnosis, bioinformatics, and other areas, its adoption is limited by two shortcomings: expensive computation and poor performance. To address these two bottlenecks, a novel algorithm, reduced support vector clustering (RSVC), is proposed. RSVC shares the framework of SVC, but its core consists of a reduction strategy and a new labeling approach. The reduction strategy is designed according to the Schrödinger equation; it extracts the data that are important to model development to form a reduced subset, and optimizes the objective on this subset. The resulting clustering model loses little quality while costing less to compute. The new labeling approach is based on geometric properties of the feature space of the Gaussian kernel function, which are stated and proved to guarantee the validity of the approach; it detects clusters by clustering the support vectors and the remaining data separately. Theoretical analysis and experimental results show that, compared with similar algorithms, RSVC addresses both bottlenecks more effectively in terms of performance and efficiency, and achieves good clustering results in practical applications. This suggests that RSVC can serve as a practical clustering method in a wider range of applications.
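As background for the labeling approach mentioned above, the following minimal Python sketch illustrates two textbook facts about the Gaussian kernel feature space that standard SVC relies on: every data image lies on the unit sphere because K(x, x) = 1, and the squared distance from an image to a sphere center a = Σ_j β_j φ(x_j) can be computed from kernel values alone. This is not the paper's RSVC method, and these are not necessarily the geometric properties the paper proves. The function names, the width parameter `q`, and the uniform weights `beta` are assumptions chosen for illustration; in SVC the weights would come from solving the optimization problem.

```python
import math

def gauss_kernel(x, y, q=1.0):
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2); q is an assumed width parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-q * sq_dist)

def feature_distance_sq(x, y, q=1.0):
    """Squared feature-space distance ||phi(x) - phi(y)||^2 = K(x,x) - 2K(x,y) + K(y,y)."""
    return gauss_kernel(x, x, q) - 2.0 * gauss_kernel(x, y, q) + gauss_kernel(y, y, q)

def radius_sq(x, points, beta, q=1.0):
    """Squared distance from phi(x) to the center a = sum_j beta_j * phi(points[j])."""
    cross = sum(b * gauss_kernel(p, x, q) for p, b in zip(points, beta))
    const = sum(
        bi * bj * gauss_kernel(pi, pj, q)
        for pi, bi in zip(points, beta)
        for pj, bj in zip(points, beta)
    )
    return gauss_kernel(x, x, q) - 2.0 * cross + const

# Illustrative data: two well-separated 1-D points with assumed uniform weights.
points = [(0.0,), (2.0,)]
beta = [0.5, 0.5]
r_end = radius_sq((0.0,), points, beta)   # image of a data point
r_mid = radius_sq((1.0,), points, beta)   # image of the midpoint between them
```

In this toy setting the midpoint's image is farther from the center than the images of the two data points, which is the kind of feature-space behavior that segment-based cluster labeling in standard SVC exploits.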
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
Peking University Core Journals (北大核心)
2010, No. 8, pp. 1372-1381 (10 pages)
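Funding
Key Program of the National Natural Science Foundation of China (60673099, 60873146)
National High-Tech Research and Development Program of China (863 Program) (2007AA04Z114, 2009AA02Z307)
Fund of the Jilin Province Key Laboratory of New Biometric Technology (20082209)
Jilin University "211 Project" Phase III Construction Fund
Fund of the Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education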
Keywords
support vector clustering (SVC)
reduction strategy
Schrödinger equation
new labeling approach
geometric property of feature space