Abstract
Automatic summarization for mobile terminals places stricter limits on summary length because of the small screen. This paper designs and implements a mobile summarization system for Chinese news. After the news web page is parsed, its maximally repeated strings are extracted as the document's keyword set, and edit distance is used to generate a summary suited to display on a mobile terminal. For documents that contain subtitles, a hierarchical summary structure is adopted to improve coverage, and a Q&A-based evaluation is used to verify that this structure is effective for such documents. Experimental results show that the generated mobile summaries perform well in length, readability, and completeness.
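The two primitives named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the system combines the two steps, so only the building blocks are shown. The function names, the length bounds (`min_len`, `max_len`), and the brute-force counting strategy are assumptions made for illustration. Here a "maximally repeated string" is taken to mean a substring that occurs at least twice and cannot be extended by one character on either side without losing occurrences.

```python
from collections import Counter


def maximal_repeats(text, min_len=2, max_len=8, min_count=2):
    """Return {substring: count} for maximally repeated substrings.

    Brute-force sketch (assumed definition): a substring is kept if it
    occurs at least `min_count` times and every one-character extension
    occurs strictly fewer times.
    """
    # Count every substring up to max_len + 1 so that substrings of
    # length max_len can still be checked against their extensions.
    counts = Counter(
        text[i:i + n]
        for n in range(1, max_len + 2)
        for i in range(len(text) - n + 1)
    )
    extendable = set()
    for s, c in counts.items():
        if len(s) < 2:
            continue
        # If dropping the last (or first) character does not change the
        # count, the shorter string always appears inside s: not maximal.
        if counts[s[:-1]] == c:
            extendable.add(s[:-1])
        if counts[s[1:]] == c:
            extendable.add(s[1:])
    return {
        s: c
        for s, c in counts.items()
        if min_len <= len(s) <= max_len
        and c >= min_count
        and s not in extendable
    }


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]
```

Both functions operate on Unicode characters, so they apply to Chinese text without word segmentation. The brute-force counter is quadratic in memory for long inputs; a suffix-array or suffix-tree approach would be the usual choice at scale.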
Source
《中文信息学报》
CSCD
Peking University Core Journal
2008, No. 1, pp. 87-92 (6 pages)
Journal of Chinese Information Processing
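Funding
National Natural Science Foundation of China (60373095, 60673039)
National High Technology Research and Development Program of China (863 Program) (2006AA01Z151)
Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education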
Keywords
computer application
Chinese information processing
mobile summarization
maximally repeated strings
edit distance
hierarchical summarization