Application of an Improved Hierarchical Clustering Algorithm to Literature Analysis (cited by: 7)
Abstract  Scientific and technical literature reflects the direction of technological development, and analyzing it helps researchers track the research frontier accurately. A key difficulty for conventional clustering algorithms is choosing appropriate parameter values. To address this, the paper proposes an improved algorithm based on hierarchical clustering, combining outlier detection with clustering, to cluster scientific literature and identify the innovative design directions it focuses on. Treating outliers as important information, the algorithm observes how the number of outliers changes under different distance conditions and automatically computes and determines the termination condition that hierarchical clustering requires. This removes the need to supply a termination condition in advance while retaining the high clustering accuracy of hierarchical clustering, and the complexity of the improved algorithm is the same as that of ordinary hierarchical clustering. The improved algorithm was applied to cluster 200 documents. A comparison experiment with the k-means algorithm shows that the improved hierarchical clustering algorithm clusters well, which verifies its feasibility.
Source  《数值计算与计算机应用》 (Journal on Numerical Methods and Computer Applications), CSCD, PKU Core Journal (北大核心), 2009, No. 4, pp. 277-287 (11 pages)
Funding  Supported by the National Natural Science Foundation of China (grants 50505017, 50775111)
Keywords  hierarchical clustering algorithm; outlier detection; innovative design; innovation direction identification
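The abstract describes the stopping idea only in outline, so the following is a minimal Python sketch under stated assumptions and not the authors' procedure. It assumes single-linkage merging, treats single-member clusters as outliers, and uses a hypothetical rule: stop at the first distance threshold after which the outlier count no longer changes. The function names, the candidate thresholds, and the toy points are all illustrative.

```python
from itertools import combinations
from math import dist


def cluster_at_threshold(points, threshold):
    """Single-linkage clusters: link any two points no farther apart than `threshold`."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def count_outliers(clusters):
    """Assumption: an outlier is a cluster with exactly one member."""
    return sum(1 for members in clusters if len(members) == 1)


def choose_threshold(points, thresholds):
    """Hypothetical stopping rule: return the first threshold after which the
    outlier count stays the same at the next threshold, plus all counts."""
    counts = [count_outliers(cluster_at_threshold(points, t)) for t in thresholds]
    for k in range(1, len(counts)):
        if counts[k] == counts[k - 1] and counts[k] < len(points):
            return thresholds[k - 1], counts
    return thresholds[-1], counts


# Toy data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
thresholds = [0.5, 1.2, 2.0, 5.0]

stop_at, outlier_counts = choose_threshold(points, thresholds)
final_clusters = cluster_at_threshold(points, stop_at)
print(stop_at, outlier_counts, final_clusters)
```

For clarity the sketch reclusters from scratch at every threshold, so it does not reproduce the complexity claim in the abstract; a real implementation would read the outlier counts off a single agglomerative pass.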