Abstract
Personal attribute extraction involves two problems: attribute recognition and attribute ownership decision, that is, determining which person a recognized attribute belongs to. Attribute recognition is mainly a matter of named entity recognition, which this paper accomplishes by adjusting word segmentation software. For attribute ownership decision, current approaches mainly perform statistical operations within the scope of a single sentence; this paper goes beyond that and proposes a personal attribute extraction method that is guided by textual knowledge and classifies hierarchically, from the text level down to the sentence level. In the CIPS-SIGHAN2014 Bakeoff, the method achieved F1 values of 0.51 under lenient evaluation and 0.49 under strict evaluation, the best results in that evaluation, which demonstrates its effectiveness.
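The abstract does not spell out the classifiers or features used at each level, so the following is only a toy sketch of the general idea of deciding attribute ownership from the text level down to the sentence level. Everything in it is hypothetical and chosen for illustration: the `Sentence` type, the function names, and the mention-count heuristic for picking the person a text is mainly about are assumptions, not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sentence:
    # Hypothetical container: the person names and (attribute_type, value)
    # pairs are assumed to have been found already by an NER step.
    persons: List[str]
    attributes: List[Tuple[str, str]]


def text_level_focus(sentences: List[Sentence]) -> Optional[str]:
    """Text level: guess the person the document is mainly about,
    here simply the most frequently mentioned one (an assumption)."""
    counts = {}
    for s in sentences:
        for p in s.persons:
            counts[p] = counts.get(p, 0) + 1
    return max(counts, key=counts.get) if counts else None


def assign_attributes(sentences: List[Sentence]) -> List[Tuple[str, str, str]]:
    """Sentence level: if a sentence names exactly one person, that person
    owns its attributes; otherwise fall back to the text-level focus."""
    focus = text_level_focus(sentences)
    triples = []
    for s in sentences:
        owner = s.persons[0] if len(s.persons) == 1 else focus
        if owner is None:
            continue  # no person known anywhere in the text
        for attr_type, value in s.attributes:
            triples.append((owner, attr_type, value))
    return triples


# Made-up example document (not data from the paper).
doc = [
    Sentence(["Zhang San"], [("birth_year", "1970")]),
    Sentence([], [("school", "Peking University")]),
    Sentence(["Zhang San", "Li Si"], [("occupation", "engineer")]),
    Sentence(["Li Si"], [("occupation", "teacher")]),
    Sentence(["Zhang San"], []),
]
triples = assign_attributes(doc)
for t in triples:
    print(t)
```

The point of the sketch is the fallback: a sentence with no person mention, or with several, cannot be resolved from sentence-local evidence alone, so the decision borrows knowledge from the whole text.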
Authors
CHENG Nanchang (程南昌)
ZOU Yu (邹煜)
TENG Yonglin (滕永林)
HOU Min (侯敏)
Source
Applied Linguistics (《语言文字应用》)
CSSCI
PKU Core Journal (北大核心)
2019, No. 1, pp. 125-134 (10 pages)
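Funding
Key Project of the State Language Commission under the 13th Five-Year Plan (ZDI135-4)
National Social Science Fund of China project (16BXW023)
National Social Science Fund of China Key Bidding Project in Education (AFA170005); this paper is an interim result of these projects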
Keywords
textual knowledge
hierarchical classification
named entity recognition
attribute ownership decision
anaphora resolution