Abstract
In headline generation, existing methods mostly take the sentence or the phrase as the basic processing unit and produce the headline by single-sentence compression or sentence synthesis. These methods either lose key information from the document because sentences are too sparse, or yield poorly readable headlines because phrase synthesis lacks grammatical constraints. This paper proposes a headline generation model that incorporates salient event information. The model first learns salient events through the mutual reinforcement principle and uses them to guide the selection of candidate sentences. It then builds a word graph from these candidate sentences and applies a ranking strategy that combines the salience, fluency, and coverage of each path to produce the final headline. Experimental results on a standard benchmark dataset show that the proposed model outperforms current mainstream methods.
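The word-graph step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: salient-event words are passed in directly rather than learned by mutual reinforcement, fluency is approximated by mean edge count instead of a language model, the three scores are combined with equal hypothetical weights, and paths are enumerated exhaustively, which is only practical for tiny graphs. All function and parameter names are illustrative.

```python
from collections import defaultdict

START, END = "<s>", "</s>"

def build_word_graph(sentences):
    """Merge tokenized sentences into a directed graph; edge weight = bigram count."""
    edges = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        path = [START] + [t.lower() for t in tokens] + [END]
        for a, b in zip(path, path[1:]):
            edges[a][b] += 1
    return edges

def enumerate_paths(edges, max_len=12):
    """Yield every cycle-free START -> END path with at most max_len words."""
    stack = [(START, [])]
    while stack:
        node, words = stack.pop()
        for nxt in edges[node]:
            if nxt == END:
                if words:
                    yield list(words)
            elif nxt not in words and len(words) < max_len:
                stack.append((nxt, words + [nxt]))

def score_path(words, edges, salient, w_sal=1.0, w_flu=1.0, w_cov=1.0):
    """Combine salience, fluency, and coverage (assumed equal weights)."""
    path = [START] + words + [END]
    # Fluency: mean edge count along the path, a crude stand-in for a language model.
    fluency = sum(edges[a][b] for a, b in zip(path, path[1:])) / (len(path) - 1)
    # Salience: fraction of path words that are salient-event words.
    salience = sum(w in salient for w in words) / len(words)
    # Coverage: fraction of salient-event words that the path contains.
    coverage = len(salient & set(words)) / max(len(salient), 1)
    return w_sal * salience + w_flu * fluency + w_cov * coverage

def generate_headline(sentences, salient_words, max_len=12):
    """Return the highest-scoring path through the word graph as a string."""
    edges = build_word_graph(sentences)
    salient = {w.lower() for w in salient_words}
    best = max(enumerate_paths(edges, max_len),
               key=lambda p: score_path(p, edges, salient),
               default=[])
    return " ".join(best)

if __name__ == "__main__":
    candidates = [
        "storm hits northern coast".split(),
        "powerful storm hits coast towns".split(),
        "storm damages coast towns".split(),
    ]
    print(generate_headline(candidates, {"storm", "hits", "coast"}))
```

Because a path may combine edges from different candidate sentences, the selected headline need not match any single input sentence, which is the point of multi-sentence compression. Repeated words within one sentence collapse into a single node here; a fuller treatment would disambiguate them by context.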
Authors
YANG Bing (杨冰)
SUN Rui (孙锐)
JI Donghong (姬东鸿)
School of Computer, Wuhan University, Wuhan 430072, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2016, Issue 24, pp. 236-240, 266 (6 pages)
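Funding
Major Bidding Program of the National Social Science Foundation of China (No. 11&ZD189)
General Program of the National Natural Science Foundation of China (No. 61373108)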
Keywords
headline generation (标题生成)
salient event (显著性事件)
multi-sentence compression (多语句压缩)
mutual reinforcement principle (互增强原则)