Abstract
Traditional support vector machine (SVM) multi-class classification methods commonly suffer from long training and testing times, and real input samples are often not linearly separable. To address these problems, an improved SVM multi-class classification method is proposed. First, a Mercer kernel function maps sample vectors that are not linearly separable in the input space into a high-dimensional feature space, where they become linearly separable. Next, a binary tree is used to construct the SVM multi-class classifier in that feature space and to perform classification. Finally, the method is applied to web text classification. Experimental results show that the method handles multi-class text classification when the input text vectors are not linearly separable, reduces the time consumed in training and testing, and improves the accuracy of multi-class text classification to some extent.
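The two ideas in the abstract, a Mercer kernel to handle data that is not linearly separable and a binary tree of two-class classifiers for the multi-class decision, can be illustrated with a small sketch. This is not the paper's implementation. It assumes a Gaussian (RBF) kernel, a class split that simply halves the class list at each node, and a kernel perceptron as a dependency-free stand-in for the SVM at each node. The names `rbf_kernel`, `KernelPerceptron`, `build_tree`, `predict`, and `ring`, and the toy ring data, are all hypothetical.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel, a standard example of a Mercer kernel."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

class KernelPerceptron:
    """Binary kernel classifier with +1/-1 labels (assumed stand-in for a kernel SVM)."""

    def __init__(self, kernel=rbf_kernel, epochs=100):
        self.kernel, self.epochs = kernel, epochs

    def score(self, x):
        # Decision value: weighted sum of kernel evaluations against stored points.
        return sum(a * t * self.kernel(p, x)
                   for a, t, p in zip(self.alpha, self.targets, self.points) if a)

    def fit(self, points, targets):
        self.points, self.targets = points, targets
        self.alpha = [0] * len(points)
        for _ in range(self.epochs):
            mistakes = 0
            for i, (p, t) in enumerate(zip(points, targets)):
                if t * self.score(p) <= 0:  # misclassified: raise this point's weight
                    self.alpha[i] += 1
                    mistakes += 1
            if mistakes == 0:  # every training point is on the correct side
                break
        return self

    def predict(self, x):
        return 1 if self.score(x) > 0 else -1

def build_tree(X, y, classes):
    """Halve the class list at each node and train one binary classifier per split."""
    node = {"classes": classes, "clf": None, "left": None, "right": None}
    if len(classes) > 1:
        half = len(classes) // 2
        left, right = classes[:half], classes[half:]
        # Only samples whose class reaches this node take part in its training.
        rows = [(x, 1 if label in left else -1)
                for x, label in zip(X, y) if label in classes]
        node["clf"] = KernelPerceptron().fit([r[0] for r in rows],
                                             [r[1] for r in rows])
        node["left"] = build_tree(X, y, left)
        node["right"] = build_tree(X, y, right)
    return node

def predict(node, x):
    """Walk from the root to a leaf, evaluating one binary classifier per level."""
    while node["clf"] is not None:
        node = node["left"] if node["clf"].predict(x) == 1 else node["right"]
    return node["classes"][0]

def ring(radius, n=8):
    """n points evenly spaced on a circle: rings are not linearly separable."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Toy data: three concentric rings, one class per ring.
X = ring(0.5) + ring(2.0) + ring(4.0)
y = ["inner"] * 8 + ["middle"] * 8 + ["outer"] * 8
tree = build_tree(X, y, ["inner", "middle", "outer"])
```

Calling `predict(tree, (0.1, 0.2))` walks the tree and returns a class label. A tree over k classes has k - 1 internal nodes, so it trains k - 1 binary classifiers and evaluates at most one per level at prediction time, which is one plausible source of the time savings the abstract reports. How the paper actually chooses the class split at each node is not stated in the abstract.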
Source
Computer Technology and Development (《计算机技术与发展》)
2015, No. 5, pp. 78-82 (5 pages)
Funding
Hubei Provincial Department of Education Science and Technology Research Program, Guiding Project, 2012 (B20128103)