
An Improved Word Sense Disambiguation Method for Chinese Full-words Based on Unsupervised Learning

Original title: 一种改进的汉语全文无指导词义消歧方法 (cited by: 6)
Abstract: Existing unsupervised word sense disambiguation methods based on expectation maximization (EM) iteration converge slowly and require a large amount of computation. To address these problems, an improved method is proposed that selects features by combining mutual information with the Z-test and uses a statistical learning algorithm to estimate the initial parameter values. Experimental results show that the improved method effectively raises the precision of Chinese word sense disambiguation and has good extensibility and practicality.
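Source: Acta Automatica Sinica (《自动化学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2010, No. 1, pp. 184-187 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60773100)
Keywords: word sense disambiguation, unsupervised learning, feature extraction, parameter estimation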
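
The abstract says features are selected by combining mutual information with the Z-test, but this record does not give the formulas. The following is only a minimal sketch of one common way to combine the two, not the authors' implementation. It assumes the text is already word-segmented, that co-occurrence is counted per context window, and that a feature is kept when both its pointwise mutual information with the ambiguous word and a one-sided z-score against the independence hypothesis exceed thresholds. The function name `select_features` and the thresholds `pmi_min` and `z_min` are hypothetical.

```python
import math
from collections import Counter


def select_features(contexts, target, pmi_min=1.0, z_min=1.96):
    """Pick context words that co-occur with `target` more than chance.

    contexts: list of token lists, one per context window (assumed
              to be already word-segmented).
    Returns {word: (pmi, z)} for words passing both thresholds.
    """
    n = len(contexts)
    window_freq = Counter()  # windows containing each word
    joint_freq = Counter()   # windows containing both word and target
    for tokens in contexts:
        seen = set(tokens)
        window_freq.update(seen)
        if target in seen:
            joint_freq.update(seen - {target})

    n_target = window_freq[target]
    selected = {}
    for word, n_joint in joint_freq.items():
        # Probability of co-occurrence if word and target were independent.
        p_indep = (window_freq[word] / n) * (n_target / n)
        if p_indep >= 1.0:
            continue  # word and target are in every window: no signal
        expected = n * p_indep
        # Pointwise mutual information in bits.
        pmi = math.log2(n_joint / expected)
        # z-score of the observed count under a binomial(n, p_indep) model.
        z = (n_joint - expected) / math.sqrt(n * p_indep * (1.0 - p_indep))
        if pmi >= pmi_min and z >= z_min:
            selected[word] = (pmi, z)
    return selected


# Toy corpus: "money" always appears with "bank"; "the" appears everywhere.
toy_contexts = [["bank", "money", "the"]] * 4 + [["river", "the"]] * 6
toy_selected = select_features(toy_contexts, "bank")
print(sorted(toy_selected))
```

In this toy corpus the function word "the" is rejected because its observed co-occurrence count equals the count expected under independence, while "money" passes both filters. Requiring both tests is a design choice of this sketch: mutual information alone tends to overrate rare words, and the z-score adds a check that the count is large enough to be statistically meaningful.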

