Domain Thesaurus Construction Based on Wiki Hyperlink Structure Graph Clustering (cited by: 7)
Abstract Domain thesauri play an important role in information retrieval, natural language processing, and question answering systems. Because natural language is complex, NLP-based thesaurus construction methods struggle to achieve satisfactory results. In recent years, Wiki encyclopedias have come into wide use as knowledge bases; a Wiki contains not only a large number of articles but also a rich link structure. Exploiting two properties of hyperlinks, anchor description and topic locality, this paper proposes an automatic domain thesaurus construction method based on clustering a weighted undirected link structure graph. The method first uses the Wiki to build an undirected link structure graph for a specific domain, then computes the weight of each link using the LSI algorithm and cosine similarity, and finally clusters the weighted undirected graph with the CPMw algorithm to obtain the domain thesaurus. Experiments show that the proposed method yields better domain thesaurus construction results.
Source Journal of Chinese Computer Systems (小型微型计算机系统), indexed in CSCD and the Peking University Core Journals list, 2014, No. 6, pp. 1286-1292 (7 pages)
Funding Supported by the National Key Technology R&D Program of China (2011BAH11B01)
Keywords domain thesaurus construction; Wiki; CPMw; LSI
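The pipeline summarized in the abstract (weight each link by cosine similarity, then cluster with CPMw) can be illustrated with a minimal sketch. Several things here are assumptions rather than the paper's actual setup: CPMw is taken to be weighted clique percolation, in which a k-clique is kept only if its intensity (the geometric mean of its edge weights) reaches a threshold; the two-dimensional vectors stand in for LSI output and are invented for illustration; and the names `cosine`, `weight_edges`, `cpmw`, `k`, and `min_intensity` are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def weight_edges(edges, vectors):
    """Weight each undirected link by the similarity of its endpoints."""
    return {frozenset(e): cosine(vectors[e[0]], vectors[e[1]]) for e in edges}

def cpmw(weights, k=3, min_intensity=0.5):
    """Weighted clique percolation sketch (assumed reading of CPMw)."""
    nodes = sorted({n for e in weights for n in e})
    cliques = []
    for combo in combinations(nodes, k):  # brute force; small graphs only
        pairs = [frozenset(p) for p in combinations(combo, 2)]
        if not all(p in weights for p in pairs):
            continue  # not a complete subgraph
        ws = [weights[p] for p in pairs]
        if any(w <= 0 for w in ws):
            continue  # geometric mean needs positive weights
        intensity = math.exp(sum(math.log(w) for w in ws) / len(ws))
        if intensity >= min_intensity:
            cliques.append(set(combo))
    # Union-find: merge k-cliques that share k-1 nodes.
    parent = list(range(len(cliques)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical stand-ins for LSI term vectors (not data from the paper).
vectors = {
    "kernel": (1.0, 0.1), "process": (0.9, 0.2), "thread": (0.95, 0.15),
    "scheduler": (0.8, 0.1), "banana": (0.1, 0.9), "apple": (0.2, 1.0),
    "fruit": (0.15, 0.95),
}
# Hypothetical undirected links between articles.
edges = [
    ("kernel", "process"), ("kernel", "thread"), ("process", "thread"),
    ("process", "scheduler"), ("thread", "scheduler"),
    ("scheduler", "banana"), ("scheduler", "apple"),
    ("banana", "apple"), ("apple", "fruit"), ("banana", "fruit"),
]
weights = weight_edges(edges, vectors)
communities = cpmw(weights, k=3, min_intensity=0.5)
print(communities)
```

With this toy data the weak links from "scheduler" to the fruit terms pull the triangle {apple, banana, scheduler} below the intensity threshold, so "scheduler" stays only in the operating-system cluster. Lowering `min_intensity` to 0.0 admits that triangle and makes "scheduler" a member of both communities, which illustrates why link weighting matters before clustering.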