
An Automatic Summarization Algorithm Based on Local and Global Information

Research of Automatic Summarization Based on Local and Global Information of Sentences
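Abstract: Feature-term weights are computed with an average term frequency strategy, a fast n-grams algorithm is used to weight the concept body in which each feature term occurs, and an improved K-means clustering algorithm is used to cluster paragraphs. On this basis, an automatic summarization algorithm based on local and global information is proposed and evaluated. The algorithm not only obtains the value of k adaptively but also effectively prevents the random choice of initial points from affecting the clustering result. The evaluation shows that precision and recall on economics and science-and-technology articles are clearly higher than on news and literary articles, and that classification accuracy using the machine-generated summaries is clearly higher than using the original text. The summaries produced by the algorithm outperform those produced by traditional methods on every metric.

The idea of our approach is to exploit both the local and global properties of sentences. To obtain the local property, we use a term weighting scheme that employs the average term frequency in a document as the normalization factor, and a fast algorithm for matching n-grams is used to optimize term weighting. The method uses an improved K-means method to cluster paragraphs and discovers thematic areas from the clustering results. It then integrates the local and global properties to produce the summary. Experiments show that it is feasible to use the method to develop a domain-specific automatic abstracting system, which is valuable for further study.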
Source: Journal of Guangxi Academy of Sciences (《广西科学院学报》), 2007, No. 4, pp. 226-228 (3 pages)
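Funding: National Natural Science Foundation of China (60673034); 2006 Guangxi Department of Education Fund Project (149); Guangxi University of Technology (广西工学院) Doctoral and Master's Fund Project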
Keywords: K-means, n-grams, paragraph clustering, natural language understanding
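
The abstract names two steps without giving formulas: term weighting normalized by the document's average term frequency, and paragraph clustering in which k is obtained adaptively and the starting points are not chosen at random. The sketch below illustrates both ideas under stated assumptions and is not the paper's algorithm. The weight (1 + log tf) / (1 + log avg_tf) is one common average-term-frequency normalization; the farthest-first seeding with a distance threshold (`min_dist`, a hypothetical parameter) is a swapped-in stand-in for the unspecified improved K-means; all function names are illustrative.

```python
import math
from collections import Counter


def term_weights(tokens):
    """Log-scaled term frequency divided by a factor built from the
    document's average term frequency (assumed formula, not from the paper)."""
    counts = Counter(tokens)
    if not counts:
        return {}
    avg_tf = sum(counts.values()) / len(counts)
    norm = 1.0 + math.log(avg_tf)
    return {t: (1.0 + math.log(tf)) / norm for t, tf in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def pick_seeds(vectors, min_dist=0.8):
    """Deterministic farthest-first seeding (assumption: a stand-in for the
    paper's method). Starts from the first paragraph and keeps adding the
    paragraph farthest from all current seeds while that cosine distance
    exceeds min_dist. The number of seeds returned plays the role of k."""
    if not vectors:
        return []
    seeds = [0]
    while True:
        best_i, best_d = None, min_dist
        for i, v in enumerate(vectors):
            d = min(1.0 - cosine(v, vectors[s]) for s in seeds)
            if d > best_d:
                best_i, best_d = i, d
        if best_i is None:
            return seeds
        seeds.append(best_i)


def assign(vectors, seeds):
    """Label each paragraph with the position of its most similar seed
    (a single assignment pass, not a full K-means iteration)."""
    return [
        max(range(len(seeds)), key=lambda j: cosine(v, vectors[seeds[j]]))
        for v in vectors
    ]
```

To try it, tokenize each paragraph, map the token lists through `term_weights`, pass the resulting vectors to `pick_seeds`, and then call `assign` with those seeds. A full K-means run would go on to recompute centroids and reassign until the labels stop changing; the seeding step is the only part that addresses the two properties the abstract highlights.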

