Abstract
Chinese word segmentation is a foundational step for machine translation, text classification, keyword extraction, and information retrieval, and it has received widespread attention in recent years. The wide adoption of search engine technology and the development of Chinese information processing have deepened research on full-text retrieval and Chinese word segmentation, and many excellent Chinese word segmentation algorithms have emerged. Based on the current state of research on Chinese word segmentation algorithms, this paper analyzes the key technical problems that must be solved to combine word segmentation with search engine information retrieval, and discusses the application of Chinese word segmentation in search engines.
Source
《计算机与网络》
2010, No. 1, pp. 60-63 (4 pages)
Computer & Network
Keywords
Search Engine
Chinese Word Segmentation
Information Retrieval
Out-of-Vocabulary Word Recognition
Disambiguation