Design and Implementation of Chinese Web Page Multiple Choice Classification System Based on Support Vector Machine (基于SVM的文本多选择分类系统的设计与实现)

Cited by: 10
Abstract: As the Internet spreads and the demand for specific information grows, quickly retrieving information of a particular category has become a problem that search engines, portal sites, and similar services must solve. Web page classification is currently handled by machine-learning text classification algorithms, but traditional methods largely ignore the characteristics of the text data and offer a single, undifferentiated classification service. This system takes the characteristics of web page text into account and uses the text title as its entry point, providing both a fast classification and a regular SVM-based classification. Fast classification maps the text title directly to a category label by means of a trained word segmentation model. If fast classification does not succeed, the text body is segmented with the jieba tokenizer, word vectors are trained with word2vec, and the text is then classified by SVM according to the classification requirements, which completes the regular classification. Offering these two services lets the system meet different needs.
Authors: DING Shitao (丁世涛), LU Jun (卢军), HONG Honghui (洪鸿辉), HUANG Ao (黄傲), GUO Zhiyuan (郭致远) (Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2020, No. 1, pp. 147-152 (6 pages)
Keywords: machine learning; title; fast classification; word2vec; SVM
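The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The keyword table, the whitespace tokenizer (a stand-in for jieba), the rule that the fast path succeeds only when the title points to exactly one category, and the toy fallback classifier (a placeholder for the word2vec + SVM stage) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical keyword-to-label table standing in for the trained title model.
TITLE_KEYWORDS: Dict[str, str] = {
    "football": "sports",
    "stock": "finance",
    "vaccine": "health",
}


def tokenize(text: str) -> List[str]:
    """Whitespace tokenizer; a stand-in for a real segmenter such as jieba."""
    return text.lower().split()


def fast_classify(title: str) -> Optional[str]:
    """Return a label only if the title tokens point to exactly one category.

    Treating zero or several candidate categories as failure is an assumed
    criterion; the abstract does not say when the fast path fails.
    """
    labels = {TITLE_KEYWORDS[t] for t in tokenize(title) if t in TITLE_KEYWORDS}
    return labels.pop() if len(labels) == 1 else None


def toy_slow_classifier(body: str) -> str:
    """Placeholder for the word2vec + SVM stage: majority vote over keyword hits."""
    counts: Dict[str, int] = {}
    for token in tokenize(body):
        label = TITLE_KEYWORDS.get(token)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return max(counts, key=counts.get) if counts else "unknown"


def classify(
    title: str, body: str, slow_classifier: Callable[[str], str]
) -> Tuple[str, str]:
    """Try the title-based fast path first, then fall back to the body classifier."""
    label = fast_classify(title)
    if label is not None:
        return label, "fast"
    return slow_classifier(body), "regular"


if __name__ == "__main__":
    print(classify("Football final tonight", "", toy_slow_classifier))
    print(classify("Weekly roundup", "stock prices rose as stock markets rallied",
                   toy_slow_classifier))
```

Passing the fallback in as a callable keeps the dispatch logic separate from whichever model serves the regular path, so a real word2vec + SVM pipeline could replace the toy function without changing `classify`.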