Abstract
As information retrieval technology attracts growing attention, Chinese automatic word segmentation becomes increasingly important. The computer identifies and processes the words in a text and passes the result directly to the search engine for retrieval. Building on the Domain Resource Integration System (DRIS), this paper designs and develops a new Chinese automatic word segmentation module. After comparing candidate algorithms, the forward matching algorithm was chosen as the module's basic algorithm; its file structure, Chinese dictionary initialization, and recognition process are described in detail. Subsequent use shows that the module considerably improves retrieval efficiency and quality of service and meets the design requirements.
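The abstract names forward matching as the module's basic algorithm but gives no details. As a rough illustration only, the sketch below assumes the common forward maximum matching variant: scan left to right and, at each position, take the longest dictionary word that fits. The function name `forward_max_match`, the `max_len` parameter, and the toy dictionary are hypothetical and are not taken from the paper or from DRIS.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy left-to-right segmentation that prefers the longest dictionary word.

    Assumption: this is the textbook forward maximum matching scheme,
    not necessarily the exact procedure used in the paper's module.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as the fallback.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"中文", "自动", "分词", "自动分词", "技术", "信息", "检索", "信息检索"}

print(forward_max_match("中文自动分词技术", toy_dictionary))
# → ['中文', '自动分词', '技术']
```

In this sketch the dictionary is a plain in-memory set; a production module would presumably load it from a dictionary file at start-up, which is the initialization step the abstract mentions.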
Source
Electronic Design Engineering (《电子设计工程》)
2016, No. 14, pp. 158-160 (3 pages)