
Chinese search result clustering based on key noun phrase clustering (cited by: 1)
Abstract: Most current search result clustering methods are document-based and cannot generate meaningful, readable cluster labels. To address this problem, this paper proposes a Chinese search result clustering method based on key noun phrase clustering. The method takes noun phrases extracted from the search results, supplemented by related search terms, as candidate cluster labels. It filters the candidates using the C-Value algorithm and IDF values, clusters the remaining labels with the Chameleon algorithm, and finally assigns each search result to its most relevant cluster. Experiments show that using key noun phrases and related search terms as cluster labels effectively improves the descriptiveness of the labels and reduces the time complexity of the clustering algorithm.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal (北大核心), 2009, Issue 31, pp. 118-121 (4 pages)
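Funding: National High Technology Research and Development Program of China (863 Program), No. 2006AA010105; National Natural Science Foundation of China, No. 60772081; Beijing Municipal Universities Talent Development Program (人才强教计划), No. PXM2007_014224_044677 and No. PXM2007_014224_044676; Science and Technology Development Program of the Beijing Municipal Education Commission, No. KM200710772010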
Keywords: search result clustering; key noun phrase extraction; C-Value algorithm; Chameleon algorithm

References (11)

  • 1. Toda H, Kataoka R. A search result clustering method using informatively named entities[C]//Proceedings of the 7th Annual ACM International Workshop on Web Information and Data Management, 2005: 81-86.
  • 2. Hearst M A, Pedersen J O. Reexamining the cluster hypothesis: Scatter/gather on retrieval results[C]//Proceedings of the Nineteenth Annual International ACM SIGIR Conference, Zurich, June 1996.
  • 3. Qian Gongwei, Ni Lin, Tian Tian, Cao Rong. Design and implementation of a meta-search engine with clustering[J]. Computer Engineering and Applications, 2007, 43(22): 182-185. Cited by: 8
  • 4. Osinski S, Stefanowski J, Weiss D. Lingo: Search results clustering algorithm based on singular value decomposition[C]//Advances in Soft Computing, Intelligent Information Processing and Web Mining, Proceedings of the International IIS: IIPWM'04 Conference, Zakopane, Poland, 2004: 359-368.
  • 5. Osinski S, Weiss D. Conceptual clustering using Lingo algorithm: Evaluation on Open Directory Project data[C]//Advances in Soft Computing, Intelligent Information Processing and Web Mining, Proceedings of the International IIS: IIPWM'04 Conference, Zakopane, Poland, 2004: 369-378.
  • 6. Frantzi K, Ananiadou S, Mima H. Automatic recognition of multi-word terms: the C-value/NC-value method[J]. Int J Digit Libr, 2000(3): 115-130.
  • 7. Han Jiawei, Kamber M. Data Mining: Concepts and Techniques[M]. Fan Ming, Meng Xiaofeng, trans. Beijing: China Machine Press, 2007.
  • 8. Silverstein C, Brin S, Motwani R. Beyond market baskets: Generalizing association rules to dependence rules[J]. Data Mining and Knowledge Discovery, 1998, 2(1): 39-68.
  • 9. Karypis G, Kumar V. hMETIS 1.5.3: A hypergraph partitioning package[R]. University of Minnesota, Department of Computer Science & Engineering, November 22, 1998.
  • 10. Karypis G, Han Eui-Hong, Kumar V. Chameleon: A hierarchical clustering algorithm using dynamic modeling[J]. IEEE Computer, 1999, 32(8): 68-75.

