Abstract
Most intelligent chat systems are implemented in one of two ways: retrieval-based or generation-based. Retrieval-based methods return accurate and meaningful responses, but the content and type of those responses are limited by the chosen corpus. Generation-based methods can produce responses that are not in the corpus and are therefore more flexible, but they are prone to erroneous or meaningless replies. To address these problems, this paper proposes a new model, GRS (Generative-Retrieval-Score), which trains the retrieval model and the generation model simultaneously and uses a scoring module to score and rank the results of both; the highest-scoring response is taken as the output of the overall dialogue system. GRS thereby combines the advantages of both approaches, so that the final responses are specific and diverse and the generated responses are flexible in form. Experiments on a real-world JingDong intelligent customer service dialogue dataset show that the proposed model outperforms existing retrieval-based and generation-based models in multi-turn dialogue modeling.
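The score-and-select step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `select_response`, the toy stand-ins `toy_retrieve`, `toy_generate`, and `toy_score`, and the word-overlap scoring are all hypothetical, chosen only to show how candidates from both models could be pooled and ranked. The joint training of the two models is not covered here.

```python
from typing import Callable, List, Tuple


def select_response(
    context: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], List[str]],
    score: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Pool candidates from both models and return the highest-scoring one."""
    candidates = retrieve(context) + generate(context)
    if not candidates:
        raise ValueError("no candidate responses")
    # Score every candidate against the dialogue context, then keep the best.
    scored = [(score(context, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Hypothetical stand-ins for the retrieval model, generation model, and scorer.
def toy_retrieve(context: str) -> List[str]:
    return ["your order ships in two days"]


def toy_generate(context: str) -> List[str]:
    return ["your refund is processed today"]


def toy_score(context: str, response: str) -> float:
    # Assumption: word overlap as a crude proxy for a learned scoring module.
    overlap = set(context.lower().split()) & set(response.lower().split())
    return float(len(overlap))


best, best_score = select_response(
    "when is my refund processed", toy_retrieve, toy_generate, toy_score
)
print(best, best_score)
```

Passing the three components in as callables keeps the selection logic independent of how each model is built, so a learned scorer could replace `toy_score` without changing `select_response`.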
Authors
郭晓哲
彭敦陆
张亚彤
彭学桂
GUO Xiaozhe; PENG Dunlu; ZHANG Yatong; PENG Xuegui (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source
《华东师范大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2020, No. 5, pp. 156-166 (11 pages)
Journal of East China Normal University (Natural Science)
Keywords
dialogue system
data center
natural language processing
intelligent e-commerce customer service