Abstract
This paper proposes an unsupervised approach to coreference resolution of Chinese noun phrases based on adaptive resonance theory (ART) networks. The approach makes full use of the features of the noun phrases themselves and adjusts the number of clusters dynamically by changing the network parameters, which addresses a known difficulty in clustering-based coreference resolution: the number of output categories is hard to determine in advance. In addition, a feature selection method based on the information gain ratio reduces the interference that weakly discriminative features cause in clustering. While maintaining resolution accuracy, the method does not depend on hand-labeled corpora and can be applied directly to real texts across domains. Experiments on the ACE Chinese corpus yielded good results.
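The link between a network parameter and the number of clusters can be illustrated with a minimal sketch. The abstract does not specify the network variant, the features, or the parameter values used in the paper, so everything below is an assumption: a simplified single-pass ART1 with fast learning, applied to hypothetical binary feature vectors for four noun-phrase mentions. The names `art1_cluster`, `vigilance`, `beta`, and `mentions` are illustrative only.

```python
from typing import List, Sequence


def art1_cluster(patterns: Sequence[Sequence[int]], vigilance: float,
                 beta: float = 1.0) -> List[int]:
    """Cluster binary vectors with a simplified ART1 (fast learning, one pass)."""
    prototypes: List[List[int]] = []  # one binary prototype per category
    labels: List[int] = []
    for x in patterns:
        norm_x = sum(x)
        # Rank existing categories by the choice function |x AND w| / (beta + |w|).
        order = sorted(
            range(len(prototypes)),
            key=lambda j: -sum(a & b for a, b in zip(x, prototypes[j]))
            / (beta + sum(prototypes[j])),
        )
        chosen = None
        for j in order:
            overlap = sum(a & b for a, b in zip(x, prototypes[j]))
            # Vigilance test: the match ratio |x AND w| / |x| must reach the threshold.
            if norm_x > 0 and overlap / norm_x >= vigilance:
                chosen = j
                break
        if chosen is None:
            # No category resonates with the input, so a new one is created.
            prototypes.append(list(x))
            chosen = len(prototypes) - 1
        else:
            # Fast learning: the prototype shrinks to its intersection with x.
            prototypes[chosen] = [a & b for a, b in zip(x, prototypes[chosen])]
        labels.append(chosen)
    return labels


# Hypothetical binary features for four mentions (not taken from the paper).
mentions = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]

low = art1_cluster(mentions, vigilance=0.5)
high = art1_cluster(mentions, vigilance=0.9)
print(len(set(low)), len(set(high)))
```

In this sketch no cluster count is supplied in advance: a lower vigilance merges the four mentions into fewer categories, while a higher vigilance keeps them apart.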
Source
Chinese High Technology Letters (《高技术通讯》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2009, No. 9, pp. 926-932 (7 pages)
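Funding
Supported by the National Natural Science Foundation of China (60575041) and the National High Technology Research and Development Program of China (863 Program, 2006AA01Z150).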
Keywords
coreference resolution, unsupervised learning, adaptive resonance theory (ART), natural language processing