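Abstract
Traditional knowledge-based word sense disambiguation (WSD) methods rely on a single type of knowledge (semantic or co-occurrence relations) and ignore the complementarity between different types of knowledge. To address this problem, a WSD model based on a heterogeneous relation graph is proposed, obtained by restructuring the traditional graph-based WSD model and validated through comparative experiments. The model integrates multiple types of disambiguation knowledge into a single graph and so exploits the advantage of having several kinds of knowledge disambiguate cooperatively. A method based on simulated annealing is also designed and implemented to automatically estimate the relation weight of each knowledge type, so as to optimize the influence of each kind of knowledge on disambiguation performance. The method is unsupervised and can effectively overcome problems such as data sparseness and the knowledge acquisition bottleneck. Test results on SemEval-2007 show that its disambiguation performance is better than that of the baseline method and of the best system that has taken part in this evaluation so far.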
As one of the most important problems in natural language processing, word sense disambiguation (WSD) aims to identify the intended meaning (sense) of a word in context. Traditional knowledge-based WSD methods usually leverage only one sort of knowledge (semantic or co-occurrence relationships) and ignore the complementarity between different types of knowledge. To deal with this problem, this paper proposes a novel WSD model based on a heterogeneous relation graph. By reconstructing the traditional graph-based WSD model, different kinds of knowledge are incorporated into a single graph. Furthermore, since not all types of knowledge play an equally important role in WSD, an automatic parameter estimation method is designed and implemented to optimize disambiguation performance by estimating the weight of each kind of relation. The parameter estimation algorithm is adapted from simulated annealing. The proposed WSD model is unsupervised. It can make full use of multi-source knowledge and alleviate the data sparseness and knowledge acquisition problems. The model is evaluated on a standard multilingual Chinese-English lexical task (SemEval-2007), and the results indicate that the proposed method significantly outperforms the baseline method. Moreover, the proposed model also performs better than the best participating system in the evaluation.
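The weight estimation step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy graph `EDGES`, the weighted-degree scoring in `sense_scores`, the stand-in `margin_objective` (the abstract does not state which objective the authors optimize), and the annealing schedule in `anneal`.

```python
import math
import random

# Hypothetical toy graph: each edge links two candidate senses and carries
# a relation type plus a raw strength. All names and values are invented.
EDGES = [
    ("bank#finance", "interest#rate", "semantic", 0.3),
    ("bank#finance", "loan#debt", "cooccurrence", 0.9),
    ("bank#river", "shore#land", "semantic", 0.8),
    ("bank#river", "loan#debt", "cooccurrence", 0.2),
]


def sense_scores(type_weights):
    """Weighted degree of every sense node under the given relation-type weights."""
    scores = {}
    for u, v, rel, strength in EDGES:
        w = type_weights[rel] * strength
        scores[u] = scores.get(u, 0.0) + w
        scores[v] = scores.get(v, 0.0) + w
    return scores


def disambiguate(word, type_weights):
    """Pick the highest-scoring sense whose label starts with 'word#'."""
    scores = sense_scores(type_weights)
    candidates = [s for s in scores if s.startswith(word + "#")]
    return max(candidates, key=lambda s: scores[s])


def margin_objective(type_weights):
    """Stand-in objective (assumption): score margin of one preferred sense."""
    scores = sense_scores(type_weights)
    return scores["bank#finance"] - scores["bank#river"]


def anneal(objective, weights, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Maximize objective over normalized relation-type weights."""
    rng = random.Random(seed)
    current, cur_val = dict(weights), objective(weights)
    best, best_val = dict(current), cur_val
    temp = t0
    for _ in range(steps):
        # Perturb every weight, keep it positive, then renormalize to sum to 1.
        cand = {k: max(1e-6, v + rng.gauss(0, 0.1)) for k, v in current.items()}
        total = sum(cand.values())
        cand = {k: v / total for k, v in cand.items()}
        val = objective(cand)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if val >= cur_val or rng.random() < math.exp((val - cur_val) / temp):
            current, cur_val = cand, val
            if val > best_val:
                best, best_val = dict(cand), val
        temp *= cooling
    return best, best_val


if __name__ == "__main__":
    init = {"semantic": 0.9, "cooccurrence": 0.1}
    best, best_val = anneal(margin_objective, init)
    print(disambiguate("bank", init), disambiguate("bank", best))
```

In this toy setting the chosen sense of "bank" depends on how the two relation types are weighted, which is the reason the abstract gives for estimating the weights automatically instead of treating all relation types equally.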
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journal
2013, No. 2, pp. 437-444 (8 pages)
Journal of Computer Research and Development
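Funding
National Natural Science Foundation of China (61132009)
Beijing Institute of Technology Science and Technology Innovation Program, Major Project Cultivation Special Fund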