
Research on Part-of-Speech Tagging Methods: POS Tagging Using Conditional Random Fields and Transformation-Based Learning (cited by: 4)
Abstract: Part-of-speech (POS) tagging is an important step in corpus construction and a fundamental research topic in natural language processing (NLP). In view of the respective strengths and limitations of statistical and rule-based approaches to POS tagging, this paper proposes an automatic POS tagging scheme that combines the conditional random field (CRF) model with transformation-based learning (TBL). Experimental results show that the scheme effectively improves tagging accuracy.
Source: New Technology of Library and Information Service (《现代图书情报技术》), indexed in CSSCI and the Peking University Core Journals list, 2009, No. 3, pp. 46-51 (6 pages)
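Funding: One of the research outputs of the National Key Technology R&D Program project "Research and Application of Key Technologies for a Multilingual Information Service Environment" (Grant No. 2006BAH03B02) and of the Institute of Scientific and Technical Information of China discipline-development fund project "Language Technology and Knowledge Technology" (Grant No. 2008DP01-9)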
Keywords: part-of-speech (POS) tagging; conditional random fields (CRF); transformation-based learning (TBL); error-driven learning
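
The abstract does not say how the statistical and rule-based stages are connected. A common arrangement, assumed here, is that the statistical tagger produces an initial tagging and error-driven transformation-based learning then learns rules that correct its remaining mistakes. The sketch below illustrates only that second stage. To stay self-contained it replaces the CRF with a most-frequent-tag lexicon, which is a deliberate simplification and not the paper's model. All function names, the single rule template (change tag A to B when the previous tag is C), and the toy English data are hypothetical.

```python
from collections import Counter, defaultdict

# Toy training data (invented for illustration): lists of (word, gold_tag).
TOY_CORPUS = [
    [("the", "DT"), ("race", "NN")],
    [("a", "DT"), ("race", "NN")],
    [("to", "TO"), ("race", "VB")],
]

def train_baseline(corpus):
    """Most-frequent-tag lexicon. ASSUMPTION: stands in for the CRF tagger."""
    counts = defaultdict(Counter)
    for sent in corpus:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def baseline_tag(lexicon, words, default="NN"):
    """Initial tagging; unknown words receive a default tag."""
    return [lexicon.get(w, default) for w in words]

def apply_rule(rule, tags):
    """Rule (a, b, prev): rewrite tag a to b where the preceding tag is prev."""
    a, b, prev = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == a and tags[i - 1] == prev:
            out[i] = b
    return out

def count_errors(tagged, gold):
    return sum(t != g for ts, gs in zip(tagged, gold) for t, g in zip(ts, gs))

def learn_rules(corpus, lexicon, min_gain=1, max_rules=10):
    """Greedy error-driven learning: repeatedly keep the rule that removes
    the most errors (fixes minus newly introduced mistakes)."""
    gold = [[t for _, t in sent] for sent in corpus]
    current = [baseline_tag(lexicon, [w for w, _ in sent]) for sent in corpus]
    rules = []
    while len(rules) < max_rules:
        # Candidate rules are instantiated only at positions tagged wrongly.
        candidates = {
            (cur[i], ref[i], cur[i - 1])
            for cur, ref in zip(current, gold)
            for i in range(1, len(cur))
            if cur[i] != ref[i]
        }
        base = count_errors(current, gold)
        best, best_gain = None, 0
        for rule in sorted(candidates):
            gain = base - count_errors([apply_rule(rule, ts) for ts in current], gold)
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None or best_gain < min_gain:
            break
        rules.append(best)
        current = [apply_rule(best, ts) for ts in current]
    return rules

def tag(words, lexicon, rules):
    """Initial tagging followed by the learned rules, in learning order."""
    tags = baseline_tag(lexicon, words)
    for rule in rules:
        tags = apply_rule(rule, tags)
    return tags
```

On the toy data the lexicon tags "race" as NN everywhere, so the learner picks up one rule, NN to VB after TO, and `tag(["to", "race"], lexicon, rules)` then yields `["TO", "VB"]`. In a real system the initial tags would come from a trained CRF, and the rule templates would be far richer than the single one used here.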

