Abstract
Text segmentation is foundational work across every area of the Internet. Segmenting the text strings a platform handles produces short strings that are better suited to aggregating users. The hidden Markov model (HMM), an important algorithm in machine learning, models transitions between states, which makes it a good match both for the contextual semantic relationships between words in a text and for the forward and backward positional relationships between one word and the next. Many open-source word segmentation tools are based on the hidden Markov model.
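To make the state-transition idea concrete, here is a minimal sketch of HMM-based segmentation using the common B/M/E/S character-tagging scheme (Begin, Middle, End of a multi-character word; Single-character word) decoded with the Viterbi algorithm. The abstract does not specify a tag set or any parameters, so the tag scheme, all probabilities, the `FLOOR` smoothing value, and the function names `viterbi` and `segment` are assumptions chosen for illustration; a real tool would estimate the probabilities from a labelled corpus.

```python
import math

STATES = "BMES"  # Begin / Middle / End of a multi-character word, Single-character word
NEG_INF = float("-inf")

# Toy parameters, invented for illustration only (not taken from the article).
START = {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4}
TRANS = {  # only linguistically valid tag transitions get non-zero probability
    "B": {"M": 0.3, "E": 0.7},
    "M": {"M": 0.3, "E": 0.7},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
EMIT = {
    "B": {"北": 0.8, "我": 0.1, "爱": 0.1},
    "M": {},
    "E": {"京": 0.9},
    "S": {"我": 0.5, "爱": 0.5},
}
FLOOR = 1e-6  # assumed emission probability for characters unseen in a state


def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi(text):
    """Return the most likely B/M/E/S tag sequence for the characters of text."""
    if not text:
        return []
    score = [{s: _log(START[s]) + _log(EMIT[s].get(text[0], FLOOR)) for s in STATES}]
    back = [{}]
    for ch in text[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for s in STATES:
            # Best predecessor state for tag s at this position.
            best = max(STATES, key=lambda p: prev[p] + _log(TRANS[p].get(s, 0.0)))
            cur[s] = prev[best] + _log(TRANS[best].get(s, 0.0)) + _log(EMIT[s].get(ch, FLOOR))
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # A complete segmentation must end on E or S.
    last = max("ES", key=lambda s: score[-1][s])
    tags = [last]
    for ptr in reversed(back[1:]):
        tags.append(ptr[tags[-1]])
    return tags[::-1]


def segment(text):
    """Cut text into words at every E or S tag."""
    words, start = [], 0
    for i, tag in enumerate(viterbi(text)):
        if tag in "ES":
            words.append(text[start:i + 1])
            start = i + 1
    return words


print(segment("我爱北京"))
```

The transition table is what encodes the positional relationships between characters: a word that has begun (B) can only continue (M) or end (E), while a new word can start only after E or S.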
Source
《无线互联科技》
2016, Issue 13, pp. 106-107 (2 pages)
Wireless Internet Technology