Abstract
Conversation is an important research area in natural language processing, and its results have been widely applied. However, when training a Chinese conversation model, the large number of distinct characters and words inevitably leads to excessive model complexity. To address this problem, this paper first converts the Chinese character input of the conversation model into Pinyin and splits the Pinyin into three parts: initials, finals, and tones, thereby reducing the size of the input vocabulary. The Pinyin information is then combined into image form by an embedding method, and Pinyin features are extracted with a fully convolutional network (FCN) and a bidirectional Long Short-Term Memory (LSTM) network. Finally, a 4-layer Gated Recurrent Unit (GRU) network decodes the Pinyin features, mitigating the long-term dependency problem, to produce the model's output. On this basis, an attention mechanism is added in the decoding stage so that the output aligns better with the input. To evaluate the proposed Chinese conversation model, we built a Chinese conversation database for the medical domain and tested the model on it with BLEU and ROUGE_L as evaluation metrics.
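The first step, decomposing Chinese characters into (initial, final, tone) triples, can be illustrated with a short sketch. It uses the open-source pypinyin package as the conversion tool, which is an assumption for illustration; the paper does not state how its decomposition was implemented.

from pypinyin import pinyin, Style

def decompose(text):
    # Initials, finals, and tone digits per character; strict=False keeps
    # surface forms such as 'zh' and 'ui' rather than strict phonological ones.
    initials = [p[0] for p in pinyin(text, style=Style.INITIALS, strict=False)]
    finals = [p[0] for p in pinyin(text, style=Style.FINALS, strict=False)]
    tone3 = [p[0] for p in pinyin(text, style=Style.TONE3)]
    tones = [s[-1] if s and s[-1].isdigit() else '0' for s in tone3]
    return list(zip(initials, finals, tones))

print(decompose('中文对话'))
# expected: [('zh', 'ong', '1'), ('w', 'en', '2'), ('d', 'ui', '4'), ('h', 'ua', '4')]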
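The encoder-decoder architecture described above can likewise be sketched in code. This is a minimal PyTorch illustration under assumed layer sizes, not the authors' exact configuration: a small fully convolutional stack over the embedded Pinyin "image", a bidirectional LSTM encoder, and a 4-layer GRU decoder with additive attention.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PinyinEncoder(nn.Module):
    def __init__(self, in_channels=1, hidden=256):
        super().__init__()
        # Fully convolutional stack over the embedded Pinyin "image" (B, 1, H, T)
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),  # collapse the feature-height axis
        )
        self.bilstm = nn.LSTM(128, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):                        # x: (B, 1, H, T)
        f = self.conv(x).squeeze(2)              # (B, 128, T)
        out, _ = self.bilstm(f.transpose(1, 2))  # (B, T, 2*hidden)
        return out

class AttnGRUDecoder(nn.Module):
    def __init__(self, vocab, hidden=256, enc_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.attn = nn.Linear(hidden + enc_dim, 1)   # additive attention scorer
        self.gru = nn.GRU(hidden + enc_dim, hidden, num_layers=4, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, y_prev, state, enc_out):
        # y_prev: (B,) previous output token; enc_out: (B, T, enc_dim)
        e = self.embed(y_prev).unsqueeze(1)                  # (B, 1, hidden)
        q = e.expand(-1, enc_out.size(1), -1)                # (B, T, hidden)
        scores = self.attn(torch.cat([q, enc_out], dim=-1))  # (B, T, 1)
        ctx = (F.softmax(scores, dim=1) * enc_out).sum(1, keepdim=True)
        out, state = self.gru(torch.cat([e, ctx], dim=-1), state)
        return self.out(out.squeeze(1)), state

# Usage sketch:
# enc, dec = PinyinEncoder(), AttnGRUDecoder(vocab=4000)
# enc_out = enc(torch.randn(2, 1, 32, 20))   # a batch of Pinyin "images"
# logits, state = dec(torch.zeros(2, dtype=torch.long), None, enc_out)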
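The evaluation metrics can be computed as follows, assuming the nltk and rouge-score packages and hypothetical example sentences; the paper does not say which metric implementations it used.

from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer

reference = ['please', 'take', 'the', 'medicine', 'after', 'meals']
candidate = ['take', 'the', 'medicine', 'after', 'meals']
bleu = sentence_bleu([reference], candidate)        # BLEU, up to 4-grams by default
rouge_l = rouge_scorer.RougeScorer(['rougeL']).score(
    ' '.join(reference), ' '.join(candidate))['rougeL']
print(bleu, rouge_l.fmeasure)                       # ROUGE_L reported as F-measure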
Authors
WU Bangyu (吴邦誉); ZHOU Yue (周越); ZHAO Qunfei (赵群飞); ZHANG Pengzhu (张朋柱)
(Key Laboratory of System Control and Information Processing, Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China; College of Management Information System, Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai 200240, China)
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journal (北大核心)
2019, No. 5, pp. 113-121 (9 pages)
Funding
National Natural Science Foundation of China (91646205)
Keywords
conversation model
Pinyin feature
attention mechanism