
News Text Classification Based on Weighted Word2Vec and TextCNN (cited by: 1)
Abstract: With the prevalence of the Internet and social media, a growing volume of text reaches people online. Natural language processing is increasingly practical for handling such massive text data, and news text classification is one of its important tasks: it supports the design of news retrieval strategies, news recommendation, and public opinion monitoring. After reviewing the state of research on text representation models and classification models, this paper proposes a news text classification method based on weighted Word2Vec and TextCNN and evaluates it on a multi-class news text dataset. The experimental results show that, among text representation models, the proposed method improves precision, recall, and F1 score over the TF-IDF model, the Word2Vec model, and the random word embedding model. Among text classification models, the TextCNN model used in this paper outperforms traditional machine learning models and recurrent neural network models in both classification effectiveness and model performance.
Authors: LIAO Yunchun (廖运春); SHU Jian (舒坚) (School of Software, Nanchang Hangkong University, Nanchang 330063, China)
Source: Changjiang Information & Communications (《长江信息通信》), 2022, No. 9, pp. 32-35 (4 pages)
Funding: National Natural Science Foundation of China (61762065).
Keywords: news text classification; natural language processing; text representation; text classification
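
The abstract does not say how the Word2Vec vectors are weighted. As a rough illustration only, the sketch below assumes TF-IDF weights: each token's word vector is scaled by that token's TF-IDF weight in its document, giving a per-document matrix of the kind a TextCNN could convolve over. The function names, the smoothed IDF formula, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper; in practice the embeddings would come from a trained Word2Vec model.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Relative term frequency times a smoothed IDF (an assumed choice).
            tok: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return weights


def weighted_sequence(doc, doc_weights, embeddings, dim):
    """Scale each token's vector by its weight; unknown tokens become zero rows."""
    zero = [0.0] * dim
    return [[doc_weights[tok] * x for x in embeddings.get(tok, zero)] for tok in doc]


# Toy, made-up data standing in for a segmented news corpus and Word2Vec output.
docs = [["stocks", "rise", "market"], ["team", "wins", "match"], ["market", "falls"]]
embeddings = {"stocks": [1.0, 0.0], "market": [0.5, 0.5], "team": [0.0, 1.0]}

weights = tfidf_weights(docs)
# One row per token of the first document: the (tokens x dim) TextCNN input.
matrix = weighted_sequence(docs[0], weights[0], embeddings, dim=2)
```

Under this assumed scheme, a token that appears in fewer documents ("stocks") receives a larger weight than one that appears in more of them ("market"), so its vector contributes more strongly to the convolutional features.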
