Abstract
Support vector machines (SVMs) must be trained on a training set that has been labeled in advance. To remove this requirement, this paper presents an SVM classifier based on k-means clustering for classifying unlabeled data. The algorithm first uses k-means clustering to divide the unlabeled data into a number of subsets, each assigned a new label; it then trains the SVM on the newly labeled data set to obtain the decision boundary and support vectors; finally, it uses the trained SVM classifier to classify the unlabeled data. Simulations show a classification error below 2% with a CPU training time of 1.8280 seconds and 60 support vectors.
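The three steps described in the abstract (cluster, train, classify) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic two-blob data, the choice of k = 2, the farthest-point initialisation, and the linear SVM trained by subgradient descent on the hinge loss are all assumptions made here to keep the example dependency-free, and every function name is hypothetical. The abstract does not state which kernel or solver the authors used.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

def train_linear_svm(points, y, lam=0.01, lr=0.05, epochs=500):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    y holds labels in {-1, +1}. Returns the weight vector w and bias b.
    """
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for x, yi in zip(points, y):
            if yi * (dot(w, x) + b) < 1:  # point violates the margin
                for d in range(dim):
                    gw[d] -= yi * x[d] / n
                gb -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b

def classify(w, b, x):
    """Map the sign of the decision function back to a cluster label (0 or 1)."""
    return 1 if dot(w, x) + b >= 0 else 0

# Synthetic unlabeled data: two well-separated Gaussian blobs (assumed for the demo).
rng = random.Random(0)
data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
data += [(rng.gauss(6, 0.5), rng.gauss(6, 0.5)) for _ in range(50)]

# Step 1: k-means assigns a new label to every unlabeled point.
labels, centers = kmeans(data, k=2)

# Step 2: train the SVM on the newly labeled set (cluster 0 -> -1, cluster 1 -> +1).
y = [1 if l == 1 else -1 for l in labels]
w, b = train_linear_svm(data, y)

# Points on or inside the margin approximate the support vectors.
support_vectors = [x for x, yi in zip(data, y) if yi * (dot(w, x) + b) <= 1.05]

# Step 3: classify the unlabeled data with the trained SVM.
predictions = [classify(w, b, x) for x in data]
agreement = sum(p == l for p, l in zip(predictions, labels)) / len(data)
print("agreement with cluster labels:", agreement)
print("approximate support vectors:", len(support_vectors))
```

Because the SVM is trained on labels produced by clustering, its accuracy can only be measured against those cluster labels unless ground truth is available separately; the quality of the clustering step therefore bounds the quality of the final classifier.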
Source
Journal of Qingdao University (Natural Science Edition)
CAS
2004, No. 4, pp. 44-48 (5 pages)
Keywords
support vector machines
data classification
k-means clustering