
Automatic Summarization for Chinese Webpage Text Based on Mobile Phone Terminal
Abstract: This paper summarizes the process of generating multi-document summaries for Chinese webpage text and analyzes the key techniques in detail: text preprocessing, sentence similarity calculation, local topic region discovery, difference detection, and summary generation. For content selection, sentence relevance is assessed with a similarity measure that fuses keyword features with the inherent features of the sentences, and text clustering is used to find the differences between sentences. An automatic summarization system for mobile phone terminals was designed and implemented on the Java ME platform in the MyEclipse environment, using the lightweight UI toolkit LWUIT and WTK as the development tool. Nearly 200 articles were selected as the test corpus for an acceptability evaluation and a Q&A-based informativeness evaluation, and the results were fairly satisfactory.
Source: Computer & Digital Engineering (《计算机与数字工程》), 2013, Issue 6, pp. 943-946, 995 (5 pages)
Keywords: multi-document automatic summarization; sentence similarity; text clustering; Java ME; LWUIT; WTK
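The abstract says that sentence relevance is measured by fusing keyword features with features of the sentences themselves, but this record gives no formula. The following is a minimal sketch of one way such a fused score could look. It is not the authors' method, and every choice in it is an assumption made for illustration: the cosine similarity over term counts, the Jaccard overlap of keywords, the weight `alpha`, and all class and method names. It also assumes the sentences have already been segmented into word tokens (English tokens stand in for segmented Chinese words), and it uses Java SE collections rather than the Java ME API.

```java
import java.util.*;

// Hypothetical illustration only: a fused sentence-similarity score.
final class SentenceSimilaritySketch {

    // Count how often each token occurs in a pre-segmented sentence.
    static Map<String, Integer> termFreq(List<String> tokens) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokens) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    // Cosine similarity between the term-count vectors of two sentences.
    static double cosine(List<String> a, List<String> b) {
        Map<String, Integer> ta = termFreq(a);
        Map<String, Integer> tb = termFreq(b);
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            na += e.getValue() * e.getValue();
            Integer other = tb.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : tb.values()) {
            nb += v * v;
        }
        if (na == 0 || nb == 0) {
            return 0.0; // an empty sentence is similar to nothing
        }
        return dot / Math.sqrt(na * nb);
    }

    // Jaccard overlap of the keywords that occur in each sentence.
    static double keywordOverlap(List<String> a, List<String> b, Set<String> keywords) {
        Set<String> ka = new HashSet<>(a);
        ka.retainAll(keywords);
        Set<String> kb = new HashSet<>(b);
        kb.retainAll(keywords);
        Set<String> union = new HashSet<>(ka);
        union.addAll(kb);
        if (union.isEmpty()) {
            return 0.0; // neither sentence contains a keyword
        }
        ka.retainAll(kb); // ka now holds the intersection
        return (double) ka.size() / union.size();
    }

    // Weighted fusion: alpha weights the keyword part, (1 - alpha) the cosine part.
    static double similarity(List<String> a, List<String> b, Set<String> keywords, double alpha) {
        return alpha * keywordOverlap(a, b, keywords) + (1 - alpha) * cosine(a, b);
    }

    public static void main(String[] args) {
        List<String> s1 = Arrays.asList("mobile", "summary", "system");
        List<String> s2 = Arrays.asList("mobile", "summary", "text");
        Set<String> keywords = new HashSet<>(Arrays.asList("summary"));
        System.out.println(similarity(s1, s2, keywords, 0.5));
    }
}
```

A pairwise score of this kind could also feed a clustering step that groups near-duplicate sentences, which is the role the abstract assigns to text clustering. That step is not shown here.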

