Abstract
Vertical search engines are an industry-specific specialization of general-purpose search engines. Based on the industry characteristics, overall requirements, and workflow of the geoscience information field, this paper adds the "Paodingjieniu" (庖丁解牛) Chinese word segmentation algorithm, a topic relevance scoring algorithm, and a "Subject Management" option to the open-source Nutch search engine platform, establishing a vertical search engine for geoscience information based on the web spider model. Testing and comparison of results show that the system has clear advantages over general-purpose search engines, making the locating and retrieval of geoscience information more accurate. The system also has good extensibility and generality, and offers a useful reference for the research and development of vertical search engines.
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2012, No. 33, pp. 85-88, 95 (5 pages)
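Funding
National Natural Science Foundation of China (No. 2011093051)
China Postdoctoral Science Foundation (No. 2011M501260)
Natural Science Foundation of Hubei Province (No. 2010CDB04104)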