Abstract
This paper proposes Bagging_fastText (B_f), a multi-base-model framework for short text classification. B_f applies bootstrap aggregating (bagging) to the fastText fast text classification algorithm. With fastText as the base model, it follows the ensemble learning approach: it sets the optimal hyperparameters, trains multiple base models that together form a multi-base model, and obtains the final class through a voting mechanism. Experiments on classifying short product-name texts show that B_f classifies better than fastText, the traditional naive Bayes text classification algorithm, and the text convolutional neural network (TextCNN) algorithm.
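The bagging-and-voting procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the hyperparameter search is omitted, and the base learner is a simple token-count classifier standing in for fastText so that the example runs with the standard library alone. In practice, `train_fn` would presumably wrap fastText's supervised training instead.

```python
import random
from collections import Counter, defaultdict


def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]


def train_token_count_model(samples):
    """Stand-in base learner (an assumption, NOT fastText).

    Counts how often each token co-occurs with each label and returns
    a predict(text) function that picks the label with the most counts.
    """
    token_label_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        for token in text.split():
            token_label_counts[token][label] += 1
    # Most frequent label in this replicate, used when no token is known.
    fallback = label_counts.most_common(1)[0][0]

    def predict(text):
        scores = Counter()
        for token in text.split():
            scores.update(token_label_counts.get(token, {}))
        return scores.most_common(1)[0][0] if scores else fallback

    return predict


def train_bagged_models(data, n_models, train_fn, seed=0):
    """Train n_models base models, each on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [train_fn(bootstrap_sample(data, rng)) for _ in range(n_models)]


def predict_by_vote(models, text):
    """Return the label predicted by the largest number of base models."""
    votes = Counter(model(text) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy product-name data, invented purely for illustration.
    data = [
        ("red apple fruit", "food"),
        ("green apple juice", "food"),
        ("apple pie snack", "food"),
        ("usb charger cable", "electronics"),
        ("phone charger dock", "electronics"),
        ("laptop charger adapter", "electronics"),
    ]
    models = train_bagged_models(data, n_models=11, train_fn=train_token_count_model)
    print(predict_by_vote(models, "apple juice"))
    print(predict_by_vote(models, "charger cable"))
```

An odd number of base models avoids tied votes in the two-class case. With more classes, a tie-breaking rule would also be needed, and the abstract does not say how B_f handles that.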
Authors
Shen Yating (沈雅婷)
Zuo Zhixin (左志新)
Zijin College, Nanjing University of Science and Technology, Nanjing 210023, Jiangsu, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2021, No. 2, pp. 185-190 (6 pages)
Funding
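Natural Science Research Project of Jiangsu Higher Education Institutions (19KJB520039)
Scientific Research Project of Zijin College, Nanjing University of Science and Technology (2019ZRKX0401008)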