Abstract
To address the slow convergence and poor training performance of recurrent neural network (RNN) architectures in deep networks, the gate-structure characteristics of long short-term memory (LSTM) and Highway networks are analyzed, and a cross-layer integration gated unit (CIGU) that connects information across layers is proposed. Combined with the time-unrolled nature of RNNs, the designed inter-layer gate structure gives the CIGU model long- and short-term memory during backward gradient descent in depth, analogous to LSTM's propagation in time, thereby strengthening the deep learning capability of RNNs in the spatial (depth) dimension. The proposed structure is applied to LSTM, and the different gate structures are trained and tested on the PTB language dataset. The results show that, as the number of model layers increases, the CIGU converges significantly faster in training and achieves significantly better test results than both the traditional LSTM and the Highway-network-based LSTM.
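This record gives no equations for the CIGU itself, so the following is only a minimal sketch of the general idea the abstract describes: a Highway-style gate that blends a layer's output with the output of the layer below it, so gradients can flow across depth the way LSTM cell states flow across time. The function name `cross_layer_gate` and all parameter shapes are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cross_layer_gate(h_lower, h_upper, W_g, b_g):
    """Highway-style inter-layer gate (illustrative assumption).

    A learned gate g decides, per dimension, how much of the current
    layer's output h_upper to keep versus how much of the lower
    layer's output h_lower to carry through unchanged:
        g   = sigmoid(W_g @ h_lower + b_g)
        out = g * h_upper + (1 - g) * h_lower
    When g -> 0 the lower layer's signal passes through untouched,
    which is what lets gradients skip across depth.
    """
    g = sigmoid(W_g @ h_lower + b_g)
    return g * h_upper + (1.0 - g) * h_lower


# Toy usage with random hidden states and weights.
rng = np.random.default_rng(0)
d = 8
h_lower = rng.standard_normal(d)   # output of layer l-1 at this time step
h_upper = rng.standard_normal(d)   # output of layer l at this time step
W_g = rng.standard_normal((d, d)) * 0.1
b_g = np.zeros(d)

out = cross_layer_gate(h_lower, h_upper, W_g, b_g)
print(out.shape)  # (8,)
```

Because the gate is a sigmoid, each output element is a convex combination of the corresponding elements of the two layer outputs, so the carried signal is always bounded by them.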
Authors
余昉恒 (YU Fang-heng); 沈海斌 (SHEN Hai-bin)
Institute of Very Large Scale Integration Circuit Design, Zhejiang University, Hangzhou 310027, China
Source
《传感器与微系统》
CSCD
2018, No. 8, pp. 91-93 (3 pages)
Transducer and Microsystem Technologies
Funding
National Natural Science Foundation of China (61371032)
Keywords
recurrent neural network (RNN)
long short-term memory (LSTM)
Highway network
gated unit