Abstract
To address the low coverage that general-purpose search engines provide for specialized domains and specific topics, this work builds a search engine for agriculture-related information resources on the Internet on top of the open-source Nutch search engine architecture. An agricultural dictionary is constructed with a hash index over AGROVOC, a multilingual, structured, controlled agricultural thesaurus. An existing vector space algorithm computes the agricultural relevance of each page, and an improved PageRank algorithm is combined with that relevance to produce the final ranking. Compared with a general-purpose search engine, the system reduces the volume of returned results, improves search speed, and improves the accuracy of domain-specific search.
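The relevance step named in the abstract can be illustrated with a minimal sketch. It assumes a plain cosine similarity between a page's term-frequency vector and a dictionary vector, with a Python `dict` standing in for the hash-indexed dictionary. The term list, the weights, the function names, and the linear blend with PageRank are all hypothetical; the abstract does not give the paper's actual formulas.

```python
import math
from collections import Counter

# Hypothetical mini-dictionary (term -> weight). The paper builds its
# dictionary from AGROVOC; these four terms are placeholders only.
AGRI_TERMS = {"wheat": 1.0, "irrigation": 1.0, "fertilizer": 1.0, "soil": 1.0}


def agri_relevance(tokens, dictionary=AGRI_TERMS):
    """Cosine similarity between a page's term-frequency vector and the
    dictionary vector (a vector space model sketch, not the paper's code)."""
    tf = Counter(tokens)
    # Only dictionary terms contribute to the dot product; each lookup
    # is a hash-table access.
    dot = sum(tf[term] * weight for term, weight in dictionary.items())
    doc_norm = math.sqrt(sum(count * count for count in tf.values()))
    dict_norm = math.sqrt(sum(w * w for w in dictionary.values()))
    if doc_norm == 0 or dict_norm == 0:
        return 0.0
    return dot / (doc_norm * dict_norm)


def combined_score(relevance, pagerank, alpha=0.5):
    """Assumed linear blend of relevance and PageRank; the weighting
    scheme used in the paper is not stated in the abstract."""
    return alpha * relevance + (1 - alpha) * pagerank
```

A page whose tokens are all outside the dictionary scores 0.0, so a crawler could use a threshold on `agri_relevance` to discard off-topic pages before ranking.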
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
PKU Core Journal (北大核心)
2009, Issue 3, pp. 610-612 (3 pages)
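Funding
National High Technology Research and Development Program of China (863 Program) (2007AA10Z235, 2007AA01Z179)
National Key Technology R&D Program of China (2006BAJ09B04, 2007BAD33B01)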