Abstract
Traditional text classification models do not fully capture the intrinsic semantic information of Chinese short texts. To address this shortcoming, this paper proposes a text classification model that fuses a pre-trained model with a capsule network. A multi-scale convolutional neural network first extracts the local semantics at different levels contained in each layer of the pre-trained model. An attention mechanism then fuses the resulting multi-granularity local semantics with the global semantics obtained by the capsule network, and a regularization method is applied to improve the model's ability to discriminate the sentiment polarity of text. In comparative experiments, the model's F1 scores are evaluated on three real-world datasets from different domains. The results show that, by using the improved capsule network, the model extracts the semantic features of Chinese short texts more comprehensively and improves the accuracy of sentiment polarity discrimination.
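The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the "improved" capsule network or the attention scoring function, so the code below assumes the standard capsule squashing nonlinearity and a simple dot-product attention against a hypothetical query vector. All names (`squash`, `attention_fuse`, `query`) and the toy feature values are illustrative assumptions.

```python
import math

def squash(s):
    """Standard capsule squashing nonlinearity (an assumption here):
    keeps the vector's direction and maps its length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(z - m) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(features, query):
    """Score each feature vector by its dot product with `query`
    (a hypothetical learned vector), then return the softmax-weighted
    sum of the features together with the weights."""
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(query)
    fused = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return fused, weights

# Toy inputs (made up for illustration): two local feature vectors standing in
# for multi-scale CNN outputs, and one global vector standing in for a capsule output.
local_features = [[0.2, 0.1, 0.0], [0.0, 0.3, 0.1]]
global_feature = squash([3.0, 0.0, 4.0])
query = [1.0, 0.0, 1.0]

fused, weights = attention_fuse(local_features + [global_feature], query)
```

In this sketch the attention weights sum to one, so the fused vector is a convex combination of the local and global features; a real model would learn the scoring parameters during training.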
Authors
WANG Dong (王东)
LI Peisheng (李佩声)
College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Source
Journal of Chongqing University of Technology (Natural Science) (《重庆理工大学学报(自然科学)》)
CAS
PKU Core (北大核心)
2023, Issue 5, pp. 178-184 (7 pages)
Keywords
capsule network
sentiment analysis
pre-trained model
attention mechanism