Abstract
How to efficiently retrieve information that meets personalized needs has become a hot topic in the new era, and search engines address this need to some extent. Within a search engine, the internal word segmentation algorithm is a key component, since its purpose is to select good keywords. A good segmentation algorithm reduces the time and difficulty of searching for information and greatly improves query efficiency. Many segmentation algorithms exist, however, and they differ in performance and efficiency. This paper examines the working mechanisms of several currently popular word segmentation algorithms, compares their performance in terms of precision and recall in light of their distinct characteristics, and further studies how they process user keywords.
Source
Software (《软件》)
2013, No. 7, pp. 75-76, 120 (3 pages in total)
Funding
Undergraduate Innovation Project: Line Paging System (列线寻呼系统)
Keywords
Intelligent Information Processing
Webpage Processing
Segmentation Algorithm
Web Crawler