Abstract
As the volume of information grows rapidly, extracting summaries from large document collections has become an important research direction in natural language processing. This paper proposes an automatic text summarization method that relies on neither manually constructed training data nor language-specific knowledge resources. The method uses ranking algorithms based on modified PageRank and HITS formulas to score every sentence in a document, then selects the top-scoring sentences as the summary. Experiments show that the method is simple to implement, efficient, effective, and extensible.
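The sentence-ranking idea can be illustrated with a minimal sketch. The abstract does not give the paper's modified PageRank and HITS formulas, so the code below assumes the standard weighted PageRank update over a sentence-similarity graph, omits HITS, and uses a simple word-overlap similarity. The names `tokenize`, `similarity`, `rank_sentences`, and `summarize`, and the parameters `damping`, `iterations`, and `k`, are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import re


def tokenize(sentence):
    # Assumption: tokens are runs of word characters; no stemming,
    # stop-word list, or other language-specific resource is used.
    return set(re.findall(r"\w+", sentence.lower()))


def similarity(a, b):
    # Assumed measure: shared tokens, normalized by log token-set sizes.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    # Build a weighted, undirected similarity graph over sentences.
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # Standard (unmodified) weighted PageRank iteration.
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    # Pick the k highest-scoring sentences, returned in document order.
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "a cat sat on a mat",
        "quantum physics is hard",
    ]
    print(summarize(doc, k=2))
```

In this sketch, a sentence that shares no tokens with any other receives only the baseline score `1 - damping`, so it falls to the bottom of the ranking, while sentences that overlap heavily with many others rise to the top.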
Source
Computer and Information Technology (《电脑与信息技术》)
2009, No. 4, pp. 5-7 (3 pages)