
Chinese Text Classification Based on the Ball Vector Machine

Cited by: 2
Abstract: The application of the support vector machine (SVM) to text classification is one of the important recent advances in the field. Many experiments show that SVM achieves higher classification accuracy than other machine learning algorithms on text classification, but it converges slowly on large-scale data, which is a major drawback in practice. The ball vector machine (BVM) is a machine learning method that trains faster than SVM. This paper applies BVM to text classification. Experiments on real-world text data sets show that BVM achieves accuracy comparable to that of SVM while requiring less training time.
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, 2008, No. 12, pp. 82-84 (3 pages)
Keywords: text classification; support vector machine (SVM); ball vector machine (BVM)
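
The abstract does not describe how BVM works internally. As background only: the core vector machine and ball vector machine papers listed in the references recast SVM training as an enclosing-ball problem in kernel feature space. The sketch below illustrates just that geometric idea in plain Euclidean space, using the generic Badoiu-Clarkson iteration for an approximate minimum enclosing ball. It is not the paper's algorithm and not the LibCVM implementation; the function name `approx_enclosing_ball`, the iteration count, and the sample points are all assumptions made for illustration.

```python
import math


def approx_enclosing_ball(points, iterations=1000):
    """Approximate the minimum enclosing ball of a set of points.

    Badoiu-Clarkson iteration: start at any point, then repeatedly move
    the center a shrinking step toward the currently farthest point.
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        # The farthest point is the one that most violates the current ball.
        farthest = max(points, key=lambda p: math.dist(p, center))
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    # Choose the radius so that every point is enclosed.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


# Hypothetical toy data: the corners of a square plus its midpoint.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
center, radius = approx_enclosing_ball(points)
print(center, radius)
```

Each iteration inspects only the single farthest point, which is why ball-based formulations can scale to large data sets. The exact minimum enclosing ball for this toy data has radius sqrt(2), and the approximation approaches it as the iteration count grows.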

References (7)

  • 1. Sebastiani F. Machine Learning in Automated Text Categorization[J]. ACM Computing Surveys, 2002, 34(1): 1-47.
  • 2. Tsang I W, Kwok J T, Cheung P-M. Core Vector Machines: Fast SVM Training on Very Large Data Sets[J]. Journal of Machine Learning Research, 2005: 363-392.
  • 3. Tsang I W, Kocsor A, Kwok J T. Simpler Core Vector Machines with Enclosing Balls[C]//Proc of the 24th Int'l Conf on Machine Learning, 2007.
  • 4. Tan Songbo, Wang Yuefen. Chinese text classification corpus TanCorp V1.0[EB/OL]. (2007-08-29)[2008-01-20]. http://www.searchforum.org.cn/tansongbo/corpus.htm.
  • 5. ICTCLAS[CP/OL]. [2007-12-15]. http://www.nlp.org.cn/categories/default.php?cat_id=12.
  • 6. LibCVM[CP/OL]. [2007-12-15]. http://www.cse.ust.hk/~ivor/cvm.html.
  • 7. LibSVM[CP/OL]. [2007-12-15]. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.

Documents sharing references (10)

Co-cited documents (21)

  • 1. Li Panchi, Xu Shaohua. Analysis of kernel function characteristics of support vector machines in pattern recognition[J]. Computer Engineering and Design, 2005, 26(2): 302-304. Cited by: 98
  • 2. Zhu Meilin, Yang Pei. A multi-class incremental learning algorithm based on support vector machines[J]. Computer Engineering, 2006, 32(17): 77-79. Cited by: 11
  • 3. Yu Zhengtao, Fan Xiaozhong, Guo Jianyi, Geng Zengmin. Answer extraction for Chinese question answering systems based on latent semantic analysis[J]. Chinese Journal of Computers, 2006, 29(10): 1889-1893. Cited by: 44
  • 4. Li Liangjun, Zhang Bin, Yang Ming. A KNN text classification algorithm based on LSA dimensionality reduction[J]. Journal of Northeast Normal University (Natural Science Edition), 2007, 39(2): 33-36. Cited by: 7
  • 5. Li C G. An efficient document categorization model based on LSA and BPNN//Proceedings of the Sixth International Conference on Advanced Language Processing and Web Information Technology. Washington, DC: IEEE Computer Society, 2007.
  • 6. Tsang I W, Kocsor A, Kwok J T. Simpler core vector machines with enclosing balls//Proceedings of the 24th International Conference on Machine Learning. New York: ACM, 2007: 12-18.
  • 7. Liu S, Liu Y K, Wang B, et al. An improved hyper-sphere support vector machine//Proceedings of the Third International Conference on Natural Computation. Washington, DC, USA: IEEE Computer Society, 2007.
  • 8. Sebastiani F. Text categorization[M]//Zanasi A, ed. Text Mining and Its Applications. Southampton: WIT Press, 2005: 109-129.
  • 9. Deerwester S, Dumais S T, Furnas G W. Indexing by latent semantic analysis[J]. Journal of the American Society for Information Science, 1990, 41(6): 391-407.
  • 10. Salton G, Wong A, Yang C S. A vector space model for automatic indexing[J]. Communications of the ACM, 1975, 18(11): 613-620.

Citing documents (2)

Second-level citing documents (12)
