Abstract
Semantic information plays an important role in extracting semantic relations between named entities. Taking TongYiCi CiLin (《同义词词林》) as an example, this paper systematically investigates the effectiveness of lexical semantic information for tree kernel-based Chinese semantic relation extraction. It examines how different levels of semantic information and phenomena such as polysemy affect relation extraction, and analyzes in detail the redundancy between lexical semantic information and entity type information. Relation extraction experiments on the ACE2005 Chinese corpus show that semantic information significantly improves extraction performance when entity types are unknown; when entity types are known, it still noticeably improves performance for some relation types. This suggests that CiLin semantic information and entity type information are complementary to some degree in Chinese semantic relation extraction.
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journal (北大核心)
2014, No. 2, pp. 91-99 (9 pages)
Funding
National Natural Science Foundation of China (60873150, 90920004)
Natural Science Foundation of Jiangsu Province (BK2010219, 11KJA520003)
Keywords
Chinese entity relation extraction
tree kernel
TongYiCi CiLin (《同义词词林》)
semantic information