
Acquiring Topical Concept Network from Multiple Web Information Sources (基于多Web信息源的主题概念网络获取)

Cited by: 1
Abstract: Wikipedia provides conceptual descriptions of individual encyclopedia entries and, through its category system, organizes those entries into a concept network. Its broad coverage and effective organization of information have made it a common source for automatic knowledge acquisition. However, Wikipedia's own information is not sufficient to accurately characterize the relationships between its concepts, although such relationships are an important component of symbolic knowledge representation; other information sources are needed for this purpose. We therefore propose an approach for acquiring a topical concept network from multiple Web information sources. The approach proceeds in three steps. First, given a concept called the topical term, it obtains a group of concepts and the links between them from the Wikipedia category system; the group is centered on the topical term by some form of relevance. Second, it uses a search engine to collect related Web information as a reference for discovering and verifying the relationships between the concepts in the group, integrating well-established methods from information retrieval and natural language processing. Finally, it produces a topical concept network in which the nodes are the concepts obtained in the first step and the edges are the relationships identified and labeled in the second step. Experiments were conducted on topical terms from several domains. An empirical evaluation of the results indicates that the acquired networks are topic-oriented and that the concept relationships they contain reach a certain level of precision, which shows the feasibility and effectiveness of the proposed approach.
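The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the relevance measure or the verification method, so this sketch assumes a hop-bounded breadth-first walk for step 1 and a simple snippet co-occurrence count for step 2. The category graph, the snippets, and all function names are hypothetical stand-ins for the Wikipedia category system and search-engine results, and the relationship labeling mentioned in the abstract is not attempted.

```python
from collections import deque

# Hypothetical toy stand-in for the Wikipedia category graph.
CATEGORY_LINKS = {
    "Machine learning": ["Neural networks", "Statistics"],
    "Neural networks": ["Deep learning"],
    "Statistics": ["Probability"],
    "Deep learning": [],
    "Probability": ["Gambling"],
    "Gambling": [],
}

# Hypothetical toy stand-in for snippets returned by a search engine.
SNIPPETS = [
    "Deep learning is a family of neural networks methods in machine learning",
    "Machine learning builds on statistics",
    "Neural networks are a core machine learning model",
]


def collect_concepts(topic, max_hops):
    """Step 1 (assumed): breadth-first walk from the topical term, bounded by hops."""
    depth = {topic: 0}
    links = []
    queue = deque([topic])
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue  # do not expand concepts on the boundary
        for child in CATEGORY_LINKS.get(node, []):
            links.append((node, child))
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return set(depth), links


def support(a, b, snippets):
    """Step 2 (assumed): number of snippets that mention both concepts."""
    a, b = a.lower(), b.lower()
    return sum(1 for s in snippets if a in s.lower() and b in s.lower())


def build_topical_network(topic, max_hops=2, min_support=1):
    """Step 3: nodes come from step 1, edges are the links verified in step 2."""
    nodes, links = collect_concepts(topic, max_hops)
    scored = {(a, b): support(a, b, SNIPPETS) for a, b in links}
    edges = {pair: n for pair, n in scored.items() if n >= min_support}
    return nodes, edges


nodes, edges = build_topical_network("Machine learning")
print(sorted(nodes))
print(edges)
```

In this toy run, "Gambling" lies more than two hops from the topical term and is excluded in step 1, while the link between "Statistics" and "Probability" is dropped in step 2 because no snippet mentions both.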
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 9, pp. 1843-1854 (12 pages)
Funding: National Natural Science Foundation of China, Key Program (61232015)
Keywords: Web data sources; topical concept network; knowledge acquisition; information retrieval; natural language processing
