Abstract
The Gaussian kernel function used in traditional semi-supervised support vector machines (S3VMs) cannot properly describe the characteristics of manifold data, which degrades classification accuracy on such data. To address this problem, a cluster kernel semi-supervised support vector machine (CKS3VM) based on spectral clustering is proposed. Spectral clustering is used to re-represent the original data samples in an eigenvector space, so that samples belonging to the same cluster gather more tightly in the new representation. A cluster kernel function is constructed on this eigenvector space, and the CKS3VM is then built on that kernel, so that the samples better satisfy the cluster assumption on which semi-supervised learning relies. The experimental results show that the CKS3VM achieves high classification accuracy on unlabeled samples, performs stably, and is insensitive to the settings of its control parameters, making it suitable for classifying manifold data.
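The abstract does not give the exact kernel construction, so the following is only a minimal sketch of one common way to build a cluster kernel from a spectral embedding, not the authors' method. The function name `cluster_kernel`, the parameters `k` and `sigma`, the Gaussian affinity, and the symmetric normalization are all assumptions made for illustration.

```python
import numpy as np

def cluster_kernel(X, k, sigma=1.0):
    """Hypothetical sketch: kernel matrix from a spectral-clustering embedding.

    X     : (n, d) array holding labeled and unlabeled samples together
    k     : number of leading eigenvectors to keep (assumed = number of clusters)
    sigma : width of the Gaussian affinity (assumed control parameter)
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared distances and Gaussian affinity, with a zero diagonal.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k largest.
    _, vecs = np.linalg.eigh(L)
    V = vecs[:, -k:]
    # Row-normalize so each sample lies on the unit sphere of the embedding.
    V = V / np.maximum(np.linalg.norm(V, axis=1, keepdims=True), 1e-12)
    # Gram matrix of the embedded samples: symmetric positive semidefinite.
    return V @ V.T

# Two well-separated groups of three points each.
X_demo = np.array([[0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
                   [5.0, 5.0], [5.0, 5.1], [5.1, 5.0]])
K_demo = cluster_kernel(X_demo, k=2, sigma=0.5)
```

In this toy case, entries of `K_demo` for samples in the same group are close to 1 and entries across groups are close to 0, which is the "gathering" effect the abstract describes. A matrix of this form could be handed to any SVM solver that accepts a precomputed kernel. The semi-supervised training step of the CKS3VM itself is not sketched here.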
Source
《中国矿业大学学报》
EI
CAS
CSCD
Peking University Core Journals
2010, No. 6, pp. 886-890 (5 pages)
Journal of China University of Mining & Technology
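Funding
National Natural Science Foundation of China (60804022, 60974050, 61072094)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-08-0836)
Natural Science Foundation of Jiangsu Province (BK2008126)
Fok Ying Tung Education Foundation, Young Teachers Fund (121066)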
Keywords
semi-supervised learning
support vector machine
spectral clustering
cluster kernel
manifold data