Method on headline generation integrating salient event information
(Original title: 融入显著性事件信息的标题生成方法)

Cited by: 1

Abstract: In headline generation, existing methods mostly take the sentence or the phrase as the basic processing unit and produce the headline through single-sentence compression or sentence synthesis. These methods either lose key information from the document because individual sentences are too sparse, or yield poorly readable headlines because phrase synthesis lacks grammatical constraints. This paper proposes a headline generation model that integrates salient event information. The model first learns salient events through the mutual reinforcement principle and uses them to guide the selection of candidate sentences. It then builds a word graph from these candidate sentences and designs a ranking strategy that combines path salience, fluency, and coverage to select the final headline. Experimental results on a standard benchmark dataset show that the proposed model outperforms current mainstream methods.
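The abstract names the mutual reinforcement principle but does not give its update equations. The following is a minimal sketch, assuming a HITS-style formulation in which a sentence is salient when it contains salient events and an event is salient when it occurs in salient sentences. The function names, the set-based event representation, and the toy data are hypothetical illustrations, not the authors' implementation; the word-graph construction and path ranking steps are not covered.

```python
import math


def mutual_reinforcement(sentence_events, iterations=100, tol=1e-9):
    """Jointly score sentences and events (assumed HITS-style updates).

    sentence_events: list of sets; sentence_events[i] holds the
    (hypothetical) event identifiers extracted from sentence i.
    Returns (sentence_scores, event_scores).
    """
    events = sorted({e for evs in sentence_events for e in evs})
    sent_score = [1.0] * len(sentence_events)
    event_score = {e: 1.0 for e in events}

    for _ in range(iterations):
        # An event gains salience from every sentence that mentions it.
        new_event = {e: 0.0 for e in events}
        for i, evs in enumerate(sentence_events):
            for e in evs:
                new_event[e] += sent_score[i]

        # A sentence gains salience from every event it contains.
        new_sent = [sum(new_event[e] for e in evs) for evs in sentence_events]

        # L2-normalise both score vectors so they stay bounded.
        e_norm = math.sqrt(sum(v * v for v in new_event.values())) or 1.0
        s_norm = math.sqrt(sum(v * v for v in new_sent)) or 1.0
        new_event = {e: v / e_norm for e, v in new_event.items()}
        new_sent = [v / s_norm for v in new_sent]

        # Stop once the sentence scores no longer change noticeably.
        delta = max(
            (abs(a - b) for a, b in zip(new_sent, sent_score)), default=0.0
        )
        sent_score, event_score = new_sent, new_event
        if delta < tol:
            break

    return sent_score, event_score


def select_candidates(sent_score, k):
    """Return the indices of the k highest-scoring sentences."""
    ranked = sorted(
        range(len(sent_score)), key=lambda i: sent_score[i], reverse=True
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy document: three sentences with hand-assigned event labels.
    toy = [{"quake", "rescue"}, {"quake"}, {"weather"}]
    scores, event_scores = mutual_reinforcement(toy)
    print(select_candidates(scores, 2))
```

In this sketch the selected sentence indices would then feed the word-graph stage described in the abstract. Normalising after each round is one common design choice for keeping the iteration stable; the paper may weight the sentence-event links differently.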

Authors: YANG Bing; SUN Rui; JI Donghong (School of Computer, Wuhan University, Wuhan 430072, China)

Affiliation: School of Computer, Wuhan University

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 24, pp. 236-240, 266 (6 pages)

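Funding: Major Bidding Program of the National Social Science Foundation of China (No. 11&ZD189); General Program of the National Natural Science Foundation of China (No. 61373108)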

Keywords: headline generation; salient event; multi-sentence compression; mutual reinforcement principle

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
