Regression-Based Extractive Summarization Model (基于回归的抽取式摘要模型)

Research on regression-based extractive summarization system


Automatic text summarization generates a concise representation of a source text while retaining its core information. Summarization algorithms fall broadly into two categories: extractive and abstractive. Extractive approaches select salient words, phrases, or sentences from the original text to form the summary, whereas abstractive approaches rewrite the content and are not constrained to reuse words or phrases from the original. Automatic summarization supports many downstream applications (e.g., news digests, social media). In recent years, most neural-network-based work has modeled extractive summarization as a sequence labeling task. This creates a discrepancy between training and testing: the task is treated as classification during training but as ranking at test time. This paper proposes a neural-network-based regression model that is trained to fit ROUGE scores directly, so that the predicted scores can be used for ranking. Experimental results show that the proposed model outperforms the best previously reported extractive summarization results.
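The contrast between classification labels and regression targets can be illustrated with a minimal sketch. It is not the authors' model: there is no neural network, the unigram-overlap F1 below is only a simplified stand-in for ROUGE (the abstract does not say which ROUGE variant is fitted), and the function names `rouge1_f1`, `regression_targets`, and `select_top_k` are hypothetical. The sketch shows how each sentence can receive a real-valued target and how the same kind of score drives ranking at test time.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1: a simplified stand-in for ROUGE-1 (assumption)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def regression_targets(sentences, reference):
    """One real-valued target per sentence, instead of a 0/1 label."""
    return [rouge1_f1(s, reference) for s in sentences]


def select_top_k(sentences, scores, k):
    """Rank sentences by score and return the k best in document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


sentences = [
    "the cat sat on the mat",
    "stocks fell sharply on monday",
    "a cat was on a mat",
]
reference = "the cat sat on a mat"

# In a real system a trained regressor would predict these scores;
# here the targets themselves stand in for the predictions.
targets = regression_targets(sentences, reference)
summary = select_top_k(sentences, targets, k=1)
print(summary)
```

Because the training signal and the test-time selection criterion are the same kind of quantity (a score to be ranked), the training/testing mismatch described in the abstract does not arise in this setup.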

Authors: ZHAO Huaipeng (赵怀鹏); CHE Wanxiang (车万翔); LIU Ting (刘挺) (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)

Affiliation: School of Computer Science and Technology, Harbin Institute of Technology

Source: Intelligent Computer and Applications (《智能计算机与应用》), 2019, No. 2, pp. 200-203, 207 (5 pages in total)

Keywords: neural networks; extractive summarization; regression model

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

