Abstract
Multi-document summarization is a natural language processing technique that distills the main information described by multiple texts on the same topic into a single text at a given compression ratio. As the information available on the Internet grows ever richer, multi-document summarization has become a new research focus. This paper introduces the origins and application background of multi-document summarization, describes its relationship to other natural language processing technologies, and analyzes the state of research in China and abroad. On this basis, it summarizes and proposes the basic research roadmap and key technologies of multi-document summarization, and concludes with its outlook and development trends.
Source
《中文信息学报》
CSCD
PKU Core Journal (北大核心)
2005, No. 6, pp. 13-20, 56 (9 pages in total)
Journal of Chinese Information Processing
Funding
Key Program of the National Natural Science Foundation of China (60435020)
Keywords
artificial intelligence
natural language processing
multi-document summarization
text compression