
汉语形容词的自动词义区分研究 (Automatic Word Sense Discrimination of Chinese Adjectives) Cited by: 1

Researches on Word Sense Discrimination of Chinese Adjective
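Abstract (translated from the Chinese): Acquiring word sense knowledge is the foundation and starting point for tasks such as building word sense knowledge bases and word sense disambiguation. At present this work depends almost entirely on the judgment and insight of human experts, so in large-scale text processing it lacks the objectivity and consistency that computed meaning would offer. Taking mid- to high-frequency Chinese adjectives as the sample, this paper mines word sense features in depth and applies an iterative EM algorithm with a parameter initialization procedure to automatically discover and discriminate word senses in real text. The algorithm uses three kinds of features: easily obtained word-form (character) features, collocation features drawn from a large-scale corpus, and attribute-host relation features drawn from web corpora. These replace the syntactic structure features used in earlier work, which are hard to obtain. HowNet is further used to optimize the selection of word-form features. The work can be applied in areas such as information retrieval, can help revise and supplement existing dictionaries, and the approach can be extended to other Chinese word classes.

Abstract (authors' English version): Lexical knowledge acquisition is the bottleneck for many tasks such as word sense disambiguation and lexical knowledge base construction. This paper introduces an automatic word sense discrimination method for Chinese mid- to high-frequency adjectives. We employ the EM algorithm and exploit Chinese character features, contextual bag-of-words features, and host-attribute pairs instead of less reliable syntactic information. We further optimize the selection of morphological features by utilizing HowNet. The experimental results show that the word sense discrimination results differ from existing Chinese lexicons and could be used for lexicon modification and expansion, even for other types of Chinese words.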
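Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2009, No. 6, pp. 19-25 (7 pages)
Funding: National Basic Research Program of China (973 Program, 2004CB318102); National Natural Science Foundation of China (60775031); National Social Science Foundation of China (08BYY060); Foundation for the Author of National Excellent Doctoral Dissertation of China (200514)
Keywords: computer application; Chinese information processing; knowledge acquisition; word sense discrimination; feature selection; EM algorithm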
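The abstract describes clustering the contexts of an adjective into senses with an iterative EM procedure over surface features. The sketch below is an illustration of that general idea only, not the authors' implementation: it fits a mixture of multinomials over bag-of-feature contexts. The function name `em_sense_clusters`, the `smoothing` parameter, the random initialization (the paper's own parameter initialization procedure is not given in the abstract), and the toy English contexts are all assumptions made for this example.

```python
import math
import random


def em_sense_clusters(contexts, n_senses, n_iter=50, smoothing=0.1, seed=0):
    """Soft-cluster contexts of one target word into n_senses groups.

    contexts: list of feature lists (e.g. collocates or host nouns).
    Returns (labels, responsibilities), one entry per context.
    """
    rng = random.Random(seed)
    vocab = sorted({f for ctx in contexts for f in ctx})

    # Initialization (assumed): random soft assignments, normalized per context.
    resp = []
    for _ in contexts:
        row = [rng.random() + 1e-3 for _ in range(n_senses)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: re-estimate smoothed sense priors and feature distributions.
        denom = len(contexts) + n_senses * smoothing
        prior = [(sum(r[k] for r in resp) + smoothing) / denom
                 for k in range(n_senses)]
        feat_prob = []
        for k in range(n_senses):
            counts = {f: smoothing for f in vocab}
            for ctx, r in zip(contexts, resp):
                for f in ctx:
                    counts[f] += r[k]
            total = sum(counts.values())
            feat_prob.append({f: c / total for f, c in counts.items()})

        # E-step: posterior over senses for each context, computed in log space.
        new_resp = []
        for ctx in contexts:
            log_post = [math.log(prior[k])
                        + sum(math.log(feat_prob[k][f]) for f in ctx)
                        for k in range(n_senses)]
            top = max(log_post)
            weights = [math.exp(lp - top) for lp in log_post]
            z = sum(weights)
            new_resp.append([w / z for w in weights])
        resp = new_resp

    labels = [max(range(n_senses), key=lambda k: r[k]) for r in resp]
    return labels, resp


if __name__ == "__main__":
    # Invented toy contexts for an adjective like "high" (illustrative only).
    toy_contexts = [
        ["price", "cost", "market"],
        ["price", "market", "fee"],
        ["mountain", "building", "tower"],
        ["tower", "mountain", "wall"],
    ]
    labels, resp = em_sense_clusters(toy_contexts, n_senses=2)
    print(labels)
```

Feature lists here stand in for the three feature types named in the abstract (character, collocation, and attribute-host features); how those are extracted and weighted in the paper is not specified on this page, so the sketch treats them as one undifferentiated bag.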

