
Building a Large-scale Chinese Event Knowledge Base (构建大规模的汉语事件知识库)

Cited by: 2
Abstract  This paper proposes a method for building a large-scale Chinese event knowledge base that combines static knowledge bases with a dynamic annotated corpus to describe complete event content. Within a unified design framework, the event knowledge is partitioned into five relatively independent sub-databases that are developed separately. Built-in links among them, designed in advance, let the sub-databases cross-reference and complement one another, so they can be combined into a whole event knowledge base. This partitioning and linking makes the information richer and more reliable, while the finer work granularity keeps the construction practical. A demonstration knowledge base describing Chinese existence and ownership events was built under this framework. Its static knowledge base covers 72 situations and 1,548 word senses, and its dynamic annotated corpus contains 100,000 event-chunk-annotated sentences for 598 event target verbs. The experimental results support the feasibility of the proposed method.
Source  Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2012, No. 3, pp. 86-91, 103 (7 pages)
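Funding  National Natural Science Foundation of China (60873173); National High Technology Research and Development Program of China (2007AA01Z173); Tsinghua-Intel joint research project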
Keywords  event analysis; event annotation; event knowledge base










