Funding: the National High Technology Research and Development Program of China (863 Program); the National Natural Science Foundation of China
Abstract: The Hidden Markov Model (HMM) is a principal approach to resolving ambiguities in Chinese segmentation and part-of-speech (POS) tagging. Most previous HMM-based work on Chinese segmentation and POS tagging consults the POS information in the context but does not use lexical information, which is crucial for resolving certain morphological ambiguities. This paper proposes a method that incorporates lexical information and wider context information into the HMM. Model induction and the related smoothing technique are presented in detail. Experiments indicate that the technique improves segmentation and tagging accuracy by nearly 1%.