Industry classification technology based on the fastText algorithm (cited by: 5)
Abstract: With the rapid development of China's economy and the continuous improvement of its capacity for technological innovation, organizing and classifying information efficiently is the basis for personalized industry management and tracking analysis. Based on the characteristics and development patterns of industry information, this paper proposes a Chinese industry classification model built on the fastText algorithm. First, a keyword library for industry classification is constructed, and word segmentation and weight calculation are carried out using this feature lexicon. Then, a classifier model is built to classify Chinese industries automatically. Finally, the experiment uses 80,000 test documents covering business scope, enterprise information, and public opinion information. The results show that the classification accuracy of the proposed model is higher than that of Bayes, decision tree, KNN, and other classification algorithms, and that the model performs well in application.
Authors: WU Zhen (吴震); RAN Xiaoyan (冉晓燕); MIAO Quan (苗权); LIU Chunyan (刘纯艳); ZHANG Dong (张栋); WEI Na (魏娜) (National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China; Beijing Branch of National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100055, China; Great Wall Computer Software & System Inc., Beijing 100190, China)
Source: Journal of Beijing University of Aeronautics and Astronautics (《北京航空航天大学学报》), 2022, No. 2, pp. 193-198 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core (北大核心).
Keywords: natural language processing; industry classification; fastText algorithm; keywords; grammar model
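
The pipeline outlined in the abstract (keyword library, segmentation, weight calculation, classifier input) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword library, the forward-maximum-matching segmenter, the TF-IDF weighting, the toy samples, and all function names are assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical keyword library (industry label -> feature words); not from the paper.
KEYWORD_LIBRARY = {
    "software": ["软件开发", "信息技术", "技术服务"],
    "catering": ["餐饮服务", "食品销售"],
}
VOCAB = {word for words in KEYWORD_LIBRARY.values() for word in words}
MAX_LEN = max(len(word) for word in VOCAB)


def segment(text):
    """Forward maximum matching: take the longest library word at each position,
    falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens


def tfidf_weights(token_lists):
    """Assumed weighting scheme: term frequency times a smoothed inverse document frequency."""
    n_docs = len(token_lists)
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    weights = []
    for toks in token_lists:
        counts = Counter(toks)
        weights.append({
            tok: (c / len(toks)) * (math.log(n_docs / doc_freq[tok]) + 1.0)
            for tok, c in counts.items()
        })
    return weights


def to_fasttext_line(label, tokens):
    """Format one sample as '__label__<label> tok1 tok2 ...' for supervised fastText."""
    return f"__label__{label} " + " ".join(tokens)


# Toy samples, invented for illustration only.
samples = [("software", "软件开发及技术服务"), ("catering", "餐饮服务及食品销售")]
token_lists = [segment(text) for _, text in samples]
weights = tfidf_weights(token_lists)
lines = [to_fasttext_line(label, toks) for (label, _), toks in zip(samples, token_lists)]
print("\n".join(lines))
```

The output lines use the `__label__` prefix that fastText's supervised mode expects by default, so a file of such lines could be passed to `fasttext.train_supervised(input=...)` from the official Python bindings. How the paper feeds its keyword weights into the classifier is not stated in the abstract, so the weights are only computed here, not consumed.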