Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance (cited by: 1)
Abstract To address two shortcomings of traditional multi-document extractive summarization, namely that it cannot effectively use the semantic information shared between documents and that its summaries contain too much redundant content, a Khmer multi-document extractive summarization method based on hierarchical maximal marginal relevance (MMR) is proposed. First, the Khmer multi-document text is fed into a trained deep learning model to extract a summary for each single document. Then, the single-document summaries are merged iteratively in a hierarchical, waterfall-like manner, and an improved MMR algorithm selects the summary sentences to produce the final multi-document summary. The experimental results show that, compared with other methods, the Khmer multi-document summaries obtained by combining the deep learning method with the hierarchical MMR algorithm improve the R1, R2, R3, and RL values by 4.31%, 5.33%, 6.45%, and 4.26%, respectively. The method effectively improves the quality of Khmer multi-document summaries while preserving the diversity and distinctness of the summary sentences.
Authors ZENG Zhaolin (曾昭霖); YAN Xin (严馨); YU Bingbing (余兵兵); ZHOU Feng (周枫); XU Guangyi (徐广义) (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Yunnan Nantian Electronic Information Industry Company Limited, Kunming, Yunnan 650040, China)
Source Journal of Hebei University of Science and Technology (《河北科技大学学报》), CAS, 2020, No. 6, pp. 508-517 (10 pages)
Funding National Natural Science Foundation of China (61562049, 61462055).
Keywords natural language processing; Khmer; multi-document summarization; extractive summarization; deep learning; waterfall method; maximal marginal relevance (MMR); semantic information; redundancy; diversity; text input
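
The two-stage pipeline described in the abstract (per-document summaries, then iterative merging with MMR) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: it uses the standard MMR criterion rather than the paper's improved variant, bag-of-words cosine similarity in place of learned semantic representations, similarity to the pooled centroid as the relevance term, and whitespace-tokenized input (Khmer text would need word segmentation first). The names `cosine`, `mmr_select`, `waterfall_merge`, `k`, and `lam` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, k, lam=0.7):
    """Pick up to k sentences with standard MMR (not the paper's improved variant).

    Relevance is approximated by similarity to the centroid of the pool,
    redundancy by the highest similarity to any already-selected sentence.
    """
    vecs = [Counter(s.split()) for s in sentences]  # assumes pre-segmented text
    centroid = Counter()
    for v in vecs:
        centroid.update(v)

    selected = []
    remaining = list(range(len(sentences)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * cosine(vecs[i], centroid) - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    # Return the chosen sentences in their original pool order.
    return [sentences[i] for i in sorted(selected)]


def waterfall_merge(single_doc_summaries, k, lam=0.7):
    """Fold single-document summaries together one at a time.

    Each step pools the running summary with the next document's summary
    and re-selects k sentences with MMR (one reading of "waterfall" merging).
    """
    merged = []
    for summary in single_doc_summaries:
        merged = mmr_select(merged + summary, k, lam)
    return merged
```

In this sketch `lam` trades relevance against redundancy: values near 1 favor sentences close to the pooled content, while lower values penalize overlap with sentences already chosen more heavily.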