ON A MULTI-DOCUMENT ABSTRACT APPROACH BASED ON TEXT SEGMENTATION TECHNOLOGY (cited by: 2)
Abstract: This paper proposes a multi-document automatic summarization method based on text segmentation. The method uses HowNet as the concept acquisition tool and segments text by building a sentence-level concept vector space model (CVSM) and applying an improved DotPlotting model. The sentence CVSM is also used to compute sentence importance, and the summary is generated from sentence importance, the text segmentation result, the similarity among summary sentences, and other factors. The summaries produced by the system are evaluated with the ROUGE-N method, using F_Score as the metric. The results show that text segmentation is effective for multi-document summarization.
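For readers unfamiliar with the evaluation named in the abstract, the following is a minimal sketch of ROUGE-N computed from clipped n-gram overlap, with recall, precision, and their balanced F-score. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the whitespace tokenization, the single-reference setting, and the equal weighting of precision and recall are all choices made here, and the record does not say which variant the paper used.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f_score) for one candidate/reference pair.

    Assumption: a single reference summary and a balanced F-score (beta = 1).
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    # Counter intersection keeps the minimum count per n-gram (clipped overlap).
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return recall, precision, f_score


if __name__ == "__main__":
    # Hypothetical toy sentences, tokenized on whitespace for illustration only.
    system_summary = "the cat sat on the mat".split()
    reference_summary = "the cat lay on the mat".split()
    print(rouge_n(system_summary, reference_summary, n=1))
    print(rouge_n(system_summary, reference_summary, n=2))
```

For Chinese text, the token lists would come from a word segmenter or from individual characters rather than `str.split`; that choice affects the scores and is left open here.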
Source: Computer Applications and Software (《计算机应用与软件》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 9, pp. 40-44 (5 pages)
Funding: National Natural Science Foundation of China (90920005); Guangxi Department of Education project (201106LX873)
Keywords: text segmentation; auto-abstract; HowNet
