网站简约本体垂直搜索系统的设计与实现 (cited by: 2)

Design and implementation of Web concise ontology-base vertical search engine
Abstract: Building an ontology-based vertical search engine for a single website requires collecting and organizing descriptors and the logical relations among them, which is costly in labor. As a result, although the technical framework is mature, most website search functions still rely on character matching; they lack word segmentation, query expansion, and relevance ranking of results, and rarely hit the relevant content accurately. To address these problems, a vertical search system based on a concise website ontology library was designed and developed. Taking the China Meteorological Data Service Center (CMDC, http://data.cma.cn) as an example, the system uses protégé to build an ontology library for the site from its navigation directory and uses the Lucene engine as its technical framework. The objects in the ontology library and the web page content are segmented separately (with IKAnalyzer), and an ontology object index and a web page index are built. At query time, the front end segments the query and first expands it against the ontology object index. The expansion results are scored with the TF-IDF relevance algorithm and ranked; each score serves as the weight of the corresponding expanded ontology object and is dynamically assigned to the objects further expanded from it through secondary semantic analysis with Jena. Finally, all weighted keywords are searched against the web page index, and the results are scored for relevance and ranked. The experimental results show that the system is simple to build, can expand and recommend related query content for users, and improves both the precision and the recall of site search.
Authors: YANG Heping (杨和平); CHEN Yu (陈瑜); ZHANG Zhiqiang (张志强) (Division of Data Services, National Meteorological Information Center, Beijing 100081, China; Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; Gembloux Agro-Bio Technology, University of Liège, Gembloux 5030, Belgium)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2017, No. 19, pp. 257-264 (8 pages)
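Funding: Special Scientific Research Fund for Public Welfare Industry (Meteorology) (Major Project) (No. GYHY(QX)20150600-7); Fifth Youth Science and Technology Fund (No. NMICQJ201604)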
Keywords: ontology; vertical search engine; semantic expansion; China Meteorological Data Service Center (CMDC)
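
The query flow summarized in the abstract (segment the query, expand it against an ontology object index, reuse the TF-IDF score of each matched object as a weight, pass that weight on to related objects, then search the page index with the weighted terms) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python dictionaries in place of the Lucene indexes, whitespace splitting in place of IKAnalyzer, and a hand-written `RELATED` map in place of Jena inference. All names and sample data (`ONTOLOGY_OBJECTS`, `RELATED`, `PAGES`, `tfidf_scores`, `search`) are hypothetical, and the TF-IDF variant shown is one common formulation, not necessarily the one Lucene applies.

```python
import math
from collections import Counter

# Hypothetical toy data standing in for the two indexes described in the abstract.
ONTOLOGY_OBJECTS = {
    "precipitation": ["precipitation", "rainfall"],
    "radar": ["radar", "reflectivity"],
    "temperature": ["temperature", "air"],
}
# Hypothetical stand-in for relations a semantic reasoner might return.
RELATED = {"precipitation": ["radar"]}
PAGES = {
    "p1": ["daily", "precipitation", "rainfall", "dataset"],
    "p2": ["weather", "radar", "mosaic"],
    "p3": ["air", "temperature", "hourly"],
}


def tfidf_scores(weighted_terms, docs):
    """Score tokenized docs against {term: weight} using a simple TF-IDF sum."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term, weight in weighted_terms.items():
            if term_freq[term]:
                idf = math.log(n_docs / doc_freq[term]) + 1.0
                score += weight * (term_freq[term] / len(tokens)) * idf
        if score > 0:
            scores[doc_id] = score
    return scores


def search(query, objects=ONTOLOGY_OBJECTS, related=RELATED, pages=PAGES):
    """Return page ids ranked by relevance after weighted ontology expansion."""
    terms = query.lower().split()  # assumption: whitespace split replaces segmentation
    weighted = {term: 1.0 for term in terms}

    # Step 1: expand against the ontology object index and score each match.
    object_scores = tfidf_scores(weighted, objects)

    # Step 2: each matched object's score becomes the weight of its terms,
    # and the same weight is handed to the objects related to it.
    for obj, weight in object_scores.items():
        for expanded in [obj] + related.get(obj, []):
            for token in objects[expanded]:
                weighted[token] = max(weighted.get(token, 0.0), weight)

    # Step 3: search the page index with all weighted terms and rank.
    page_scores = tfidf_scores(weighted, pages)
    return sorted(page_scores, key=page_scores.get, reverse=True)


if __name__ == "__main__":
    print(search("precipitation"))
```

With this toy data, a query for "precipitation" also retrieves the radar page, which pure character matching would miss, while the page that mentions the query term directly still ranks first.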