
A Method of Topic Sentences Extraction Based on Clustering Algorithm
Cited by: 1
Abstract: A key step in automatic text summarization is to determine the main ideas of an article and extract the sentences that reflect them. After analyzing clustering algorithms such as k-means and k-medoids, and taking into account the practical requirements of summarization and the characteristics of documents themselves, the paper proposes a clustering-based method for extracting topic sentences. Because many articles have more than one topic, the method aims to avoid missing any of them. Experimental results show that, while improving clustering accuracy, the new method avoids omitting topics more effectively than other clustering algorithms and reflects the main ideas of the full text more completely; the extracted summaries cover the article's topics broadly while still highlighting its key points.
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), indexed in CSSCI and the Peking University Core Journals list, 2008, No. 1, pp. 49-55 (7 pages)
Keywords: automatic summarization, clustering algorithm, thematic sentence, summary cell (text unit), clustering center
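
The abstract does not give the details of the proposed method, so the following is only a minimal sketch of the general idea it builds on: cluster the sentences with plain k-means and take the sentence nearest each cluster center as a topic sentence. It is not the authors' improved algorithm. The function names, the bag-of-words representation, cosine similarity, the English-only tokenizer, and the farthest-first initialization are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def vectorize(sentence):
    """Bag-of-words term counts for one sentence (assumed representation)."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Centroid (term-wise average) of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})


def kmeans_topic_sentences(sentences, k, iterations=10):
    """Return one representative sentence per cluster, in document order.

    Assumes 1 <= k <= len(sentences) and iterations >= 1.
    """
    vectors = [vectorize(s) for s in sentences]

    # Farthest-first initialization (an assumption, chosen to be deterministic):
    # start from the first sentence, then repeatedly add the sentence that is
    # least similar to the centers chosen so far.
    centers = [vectors[0]]
    while len(centers) < k:
        far = min(
            range(len(vectors)),
            key=lambda i: max(cosine(vectors[i], c) for c in centers),
        )
        centers.append(vectors[far])

    for _ in range(iterations):
        # Assignment step: each sentence joins its most similar center.
        labels = [max(range(k), key=lambda j: cosine(v, centers[j])) for v in vectors]
        # Update step: recompute each center as the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centers[j] = mean_vector(members)

    # Extraction step: the sentence closest to each center stands for its topic.
    picked = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            picked.append(max(members, key=lambda i: cosine(vectors[i], centers[j])))
    return [sentences[i] for i in sorted(picked)]
```

Taking one sentence from every cluster is what keeps a minor topic from being dropped in favor of the dominant one, which is the failure mode the abstract says the paper targets. Chinese text would need a word segmenter in place of the regular-expression tokenizer used here.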


