Correlation Voting Fusion Strategy Used for Part of Speech Tagging
Cited by: 6

Abstract: Each part-of-speech (POS) tagging method draws on linguistic knowledge described from only one perspective. Once the training corpus reaches a certain size and the model has been refined to a certain degree, tagging accuracy becomes hard to improve further. Building on a study of four corpus-based POS tagging methods (TBED, DT, HMM, and ME), this paper proposes a new fusion strategy for POS tagging: correlation voting. The advantages of the method are analyzed theoretically, and it is compared experimentally with other fusion strategies. The results show that applying a fusion strategy describes POS tagging knowledge more comprehensively and therefore performs the tagging task better. Among the fusion strategies compared, correlation voting performs best, reducing the average tagging error rate by 27.85%.
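
The abstract does not spell out how correlation voting weighs the four taggers, so the sketch below is not the paper's method. It shows only a plain per-token plurality vote, a generic fusion baseline, to illustrate what combining tagger outputs means. The function names `majority_vote` and `relative_error_reduction` and the sample tags are hypothetical, and treating the reported 27.85% as a relative reduction is an assumption.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(tag_sequences: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-token tags from several taggers by simple plurality vote.

    Ties are broken in favor of the tagger listed first.
    This is a generic baseline, NOT the paper's correlation voting.
    """
    if not tag_sequences:
        return []
    length = len(tag_sequences[0])
    if any(len(seq) != length for seq in tag_sequences):
        raise ValueError("all taggers must tag the same number of tokens")

    fused = []
    for votes in zip(*tag_sequences):  # one tuple of votes per token
        counts = Counter(votes)
        best = max(counts.values())
        # Pick the first tag, in tagger order, that reaches the top count.
        fused.append(next(tag for tag in votes if counts[tag] == best))
    return fused


def relative_error_reduction(baseline_error: float, fused_error: float) -> float:
    """Fraction by which the error rate drops relative to the baseline."""
    return (baseline_error - fused_error) / baseline_error


# Hypothetical outputs of four taggers on a three-token sentence.
outputs = [
    ["DT", "NN", "VBZ"],
    ["DT", "NN", "NNS"],
    ["DT", "VB", "VBZ"],
    ["DT", "NN", "VBZ"],
]
print(majority_vote(outputs))
```

Each tagger's mistake in this toy input is outvoted by the other three, which is the intuition behind fusing taggers that rely on different kinds of knowledge.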
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 2, pp. 9-13 (5 pages)
Funding: National Natural Science Foundation of China (Grant No. 60372038)
Keywords: artificial intelligence; natural language processing; part-of-speech tagging; fusion strategy; correlation voting