Abstract
Traditional long short-term memory (LSTM) networks cannot automatically select the most important latent semantic factors in text classification. To address this problem, this paper proposes an improved LSTM model. First, the computation of the traditional LSTM is extended to a bidirectional mode, so that the network fully captures the preceding and following context of each input feature word. Then, a pooling layer is added before the output layer to better select the most important latent semantic factors. Experiments on Internet Movie Database review data show that the model outperforms the traditional LSTM network and other comparable models, indicating that the proposed improvements are effective in raising text classification accuracy.
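The two architectural changes named in the abstract (a bidirectional pass, then a pooling layer before the output layer) can be illustrated with a small dependency-free sketch. It is not the authors' implementation. The abstract does not state the pooling type, so max pooling over time is an assumption here, as are the sigmoid output unit, the dimensions, the random untrained weights, and every function name.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_lstm(input_dim, hidden_dim, rng):
    """Random, untrained parameters for one LSTM direction (hypothetical init)."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    # One weight matrix and bias per gate: input, forget, output, candidate.
    return {g: (mat(), [0.0] * hidden_dim) for g in "ifoc"}


def lstm_pass(params, inputs, hidden_dim):
    """Run a standard LSTM over `inputs`; return the hidden state at every step."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    states = []
    for x in inputs:
        z = x + h  # concatenate current input with previous hidden state
        gate = {}
        for g, (W, b) in params.items():
            pre = [p + bi for p, bi in zip(matvec(W, z), b)]
            act = math.tanh if g == "c" else sigmoid
            gate[g] = [act(p) for p in pre]
        # Cell update: forget part of the old cell, add the gated candidate.
        c = [f * ci + i * cand
             for f, ci, i, cand in zip(gate["f"], c, gate["i"], gate["c"])]
        h = [o * math.tanh(ci) for o, ci in zip(gate["o"], c)]
        states.append(h)
    return states


def bilstm_pool_classify(inputs, fwd, bwd, w_out, b_out, hidden_dim):
    """Bidirectional LSTM -> max pooling over time (assumed) -> sigmoid output."""
    forward = lstm_pass(fwd, inputs, hidden_dim)
    # Read the sequence right to left, then re-align states to input order.
    backward = lstm_pass(bwd, inputs[::-1], hidden_dim)[::-1]
    combined = [f + b for f, b in zip(forward, backward)]  # concat per time step
    pooled = [max(col) for col in zip(*combined)]  # strongest value per feature
    score = sum(w * p for w, p in zip(w_out, pooled)) + b_out
    return pooled, sigmoid(score)


# Toy usage with made-up sizes and random "word embeddings".
rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM, SEQ_LEN = 4, 3, 5
fwd = init_lstm(EMBED_DIM, HIDDEN_DIM, rng)
bwd = init_lstm(EMBED_DIM, HIDDEN_DIM, rng)
w_out = [rng.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN_DIM)]
sentence = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(SEQ_LEN)]
pooled, prob = bilstm_pool_classify(sentence, fwd, bwd, w_out, 0.0, HIDDEN_DIM)
```

Pooling over all time steps, rather than reading only the final hidden state, lets the output layer see the strongest activation of each feature anywhere in the review, which is one plausible reading of "selecting the most important latent semantic factors".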
Authors
LI Jianping (李建平)
CHEN Haiou (陈海鸥)
School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, Heilongjiang, P.R. China
Source
Journal of Chongqing University (《重庆大学学报》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2023, Issue 5, pp. 111-118 (8 pages)
Funding
Supported by the National Natural Science Foundation of China (61702093).
Keywords
natural language processing
text classification
recurrent neural network
long short-term memory neural network