Abstract
To address automatic text classification in data mining, this paper proposes a classification method based on the support vector machine (SVM). A multi-layer cascaded SVM model is constructed that can recognize and classify multiple pattern classes. Once the model has been built according to the category system of the training samples, it can be applied to the automatic classification of real documents. The paper describes how the model is constructed and applied, using two kernel functions, a polynomial function and a radial basis function, as the inner-product kernels. Experiments on a subset of documents from the China Journal Net full-text database (中国期刊网全文数据库) demonstrate the effectiveness of the method.
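The abstract does not spell out how the multi-layer SVM model is wired together, so the following is only a minimal sketch of one possible reading: each layer holds a binary SVM decision function that separates one category from the remaining ones, and a sample is passed down the layers until one of them accepts it. The two kernels are the standard polynomial and radial basis forms. All names (`BinarySVM`, `cascade_classify`, the toy support vectors, coefficients, and biases) are hypothetical and chosen for illustration; they are not taken from the paper, and no training step is shown.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def polynomial_kernel(degree: int = 2, coef0: float = 1.0) -> Kernel:
    """Return K(x, y) = (x . y + coef0) ** degree."""
    def kernel(x: Vector, y: Vector) -> float:
        return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree
    return kernel


def rbf_kernel(gamma: float = 1.0) -> Kernel:
    """Return K(x, y) = exp(-gamma * ||x - y||^2)."""
    def kernel(x: Vector, y: Vector) -> float:
        return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
    return kernel


class BinarySVM:
    """Decision function of an already-trained binary SVM (assumed given).

    f(x) = sum_i coeffs[i] * K(support_vectors[i], x) + bias,
    where coeffs[i] plays the role of alpha_i * y_i.
    """

    def __init__(self, support_vectors: List[Vector], coeffs: List[float],
                 bias: float, kernel: Kernel) -> None:
        self.support_vectors = support_vectors
        self.coeffs = coeffs
        self.bias = bias
        self.kernel = kernel

    def decide(self, x: Vector) -> float:
        return sum(c * self.kernel(sv, x)
                   for c, sv in zip(self.coeffs, self.support_vectors)) + self.bias


def cascade_classify(layers: List[Tuple[str, BinarySVM]], x: Vector,
                     default: str) -> str:
    """Walk the layers in order; the first SVM with f(x) > 0 names the class."""
    for label, svm in layers:
        if svm.decide(x) > 0:
            return label
    return default  # no layer accepted the sample


# Toy, hand-set parameters (hypothetical, not learned from data).
layers = [
    ("A", BinarySVM([(0.0, 0.0), (3.0, 3.0)], [1.0, -1.0], -0.5, rbf_kernel(0.5))),
    ("B", BinarySVM([(3.0, 3.0), (-3.0, -3.0)], [1.0, -1.0], 0.0, polynomial_kernel(2))),
]

for sample in [(0.1, 0.0), (3.0, 3.0), (-3.0, -3.0)]:
    print(sample, cascade_classify(layers, sample, default="C"))
```

With K categories, a cascade of this kind needs at most K-1 binary classifiers, and the order of the layers matters because earlier layers see every sample while later ones see only what was rejected above them.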
Source
Journal of the China Society for Scientific and Technical Information (《情报学报》)
CSSCI
PKU Core Journal (北大核心)
2006, No. 2, pp. 208-214 (7 pages)
Keywords
support vector machine
text classification
machine learning
pattern recognition