Abstract
Building on existing Chinese word segmentation techniques, this paper proposes a word segmentation method based on topic extraction from Chinese text. A topic dictionary is constructed along the lines of a concept semantic network to describe the conceptual semantic relations between words, and an improved maximum matching (MM) algorithm is used to segment the text. The method improves segmentation accuracy, identifies out-of-vocabulary words in the text, and standardizes topic terms in the same pass. On this basis the paper presents a Chinese search engine model based on concept retrieval, which interprets user needs at the conceptual level, supports concept search, and improves precision.
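The abstract names an improved maximum matching (MM) algorithm but gives no details of the improvement. As a point of reference only, the sketch below shows the plain forward maximum matching baseline that such methods usually start from, followed by a simple topic-term normalization step. It is not the paper's algorithm: the function names, the sample dictionary, and the synonym table are all hypothetical and chosen for illustration.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Baseline forward maximum matching (not the paper's improved variant).

    At each position, take the longest dictionary word that starts there;
    if no dictionary word matches, emit a single character.
    """
    if max_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def normalize_terms(tokens, preferred_terms):
    """Map each token to its preferred topic term, if one is listed.

    `preferred_terms` is a hypothetical synonym table standing in for the
    concept relations a topic dictionary would hold.
    """
    return [preferred_terms.get(token, token) for token in tokens]


# Hypothetical sample data for illustration.
sample_dictionary = {"中文", "分词", "中文分词", "技术", "切词"}
sample_preferred_terms = {"切词": "分词"}

segmented = forward_max_match("中文分词技术", sample_dictionary)
normalized = normalize_terms(
    forward_max_match("切词技术", sample_dictionary), sample_preferred_terms
)
```

Characters not covered by the dictionary fall through as single-character tokens. Runs of such tokens are a common starting signal for detecting out-of-vocabulary words, although the abstract does not say how the paper handles them.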
Source
Science & Technology Information (《科技信息》)
2010, Issue 35, pp. 58, 49 (2 pages)
Keywords
Chinese Word Segmentation
Concept Retrieval
Word Frequency Statistics