
Chinese Part-of-speech Tagging Based on Full Second-order Hidden Markov Model (cited by 25)
Abstract: Building on hidden Markov model theory, this paper proposes a Chinese part-of-speech tagging model that uses second-order (trigram) approximations for both the lexical probabilities and the contextual (tag) probabilities, and extends the traditional Viterbi algorithm to match. The model draws on more contextual information than standard statistical models. To address the data sparseness that arises in the statistical model, a smoothing algorithm based on linear interpolation is introduced. Experiments show that the full second-order HMM achieves higher part-of-speech tagging accuracy and disambiguation accuracy than standard bigram and trigram models.
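The abstract does not give the model's formulas, so the following is only a minimal sketch of how such a tagger could be put together, not the authors' implementation. It assumes the contextual probability has the form P(t_i | t_{i-2}, t_{i-1}) and the lexical probability the form P(w_i | t_{i-1}, t_i), that both are smoothed by linear interpolation with fixed weights, and that Viterbi is extended by searching over tag pairs. The class name `SecondOrderHMM`, the weights, the `<s>` padding tag, the tiny floor for unseen words, and the toy corpus are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

START = "<s>"  # padding tag for sentence starts (assumption of this sketch)

def ratio(num, den):
    """Relative frequency, taken as 0 when the context was never observed."""
    return num / den if den else 0.0

class SecondOrderHMM:
    def __init__(self, tagged_sentences, ctx_w=(0.1, 0.3, 0.6), lex_w=(0.3, 0.7)):
        # ctx_w: weights for unigram, bigram, trigram tag estimates (assumed values)
        # lex_w: weights for P(w | t) and P(w | t_prev, t) (assumed values)
        self.ctx_w, self.lex_w = ctx_w, lex_w
        self.uni, self.bi, self.tri = Counter(), Counter(), Counter()
        self.uni_ctx, self.bi_ctx = Counter(), Counter()
        self.lex1, self.lex2 = Counter(), Counter()
        self.n = 0
        for sent in tagged_sentences:
            a, b = START, START
            for w, t in sent:
                self.n += 1
                self.uni[t] += 1
                self.bi[(b, t)] += 1
                self.tri[(a, b, t)] += 1
                self.uni_ctx[b] += 1       # b seen as a one-tag history
                self.bi_ctx[(a, b)] += 1   # (a, b) seen as a two-tag history
                self.lex1[(t, w)] += 1
                self.lex2[(b, t, w)] += 1
                a, b = b, t
        self.tags = sorted(self.uni)

    def p_ctx(self, a, b, t):
        """Interpolated estimate of P(t | a, b)."""
        l1, l2, l3 = self.ctx_w
        return (l3 * ratio(self.tri[(a, b, t)], self.bi_ctx[(a, b)])
                + l2 * ratio(self.bi[(b, t)], self.uni_ctx[b])
                + l1 * ratio(self.uni[t], self.n))

    def p_lex(self, b, t, w):
        """Interpolated estimate of P(w | b, t), with a floor for unseen words."""
        m1, m2 = self.lex_w
        return (m2 * ratio(self.lex2[(b, t, w)], self.bi[(b, t)])
                + m1 * ratio(self.lex1[(t, w)], self.uni[t])
                + 1e-12)

    def tag(self, words):
        """Viterbi search whose states are pairs (previous tag, current tag)."""
        best = {(START, START): (0.0, [])}  # state -> (log score, tag path)
        for w in words:
            nxt = {}
            for (a, b), (score, path) in best.items():
                for t in self.tags:
                    s = (score + math.log(self.p_ctx(a, b, t))
                         + math.log(self.p_lex(b, t, w)))
                    if (b, t) not in nxt or s > nxt[(b, t)][0]:
                        nxt[(b, t)] = (s, path + [t])
            best = nxt
        return max(best.values(), key=lambda x: x[0])[1]

# Toy corpus with made-up tags; "学习" is ambiguous between n and v.
corpus = [
    [("我", "r"), ("爱", "v"), ("北京", "ns")],
    [("他", "r"), ("爱", "v"), ("学习", "v")],
    [("学习", "n"), ("很", "d"), ("重要", "a")],
]
model = SecondOrderHMM(corpus)
print(model.tag(["他", "爱", "学习"]))
```

Because each Viterbi state is a tag pair, the search costs on the order of the cube of the tag-set size per word, which is the usual price of the extra context. In practice the interpolation weights would be estimated from held-out data rather than fixed by hand as they are here.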
Source: Computer Engineering (《计算机工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 10, pp. 177-179 (3 pages)
Funding: Supported by the National Natural Science Foundation of China
Keywords: full second-order hidden Markov model; Chinese part-of-speech tagging; smoothing algorithm; Viterbi algorithm

