Research and Development of Medical Search Engine Based on Nutch (基于Nutch的医疗搜索引擎的研究与开发) Cited by: 3
Abstract: Public demand for medical information on the web keeps growing, yet general-purpose search engines retrieve domain-specific information with poor accuracy and low efficiency. This paper designs a medical vertical search engine built on Nutch components. The system implements Chinese word segmentation and derives a specialized term library by training on texts. It uses a vector space model (VSM) algorithm to compute how relevant each web page is to the medical topic, filters pages on that basis, and adds a topic-relevance factor to the ranking algorithm. Test results show that, compared with a general-purpose search engine, the system achieves higher precision when retrieving medical information, reduces interference from irrelevant results, locates medical information more accurately, and can offer the public a more targeted service.
Source: Journal of Xinjiang University (Natural Science Edition) (《新疆大学学报(自然科学版)》), CAS, 2014, No. 2, pp. 217-221 (5 pages)
Funding: Regional Science Fund (61262087)
Keywords: vertical search engine; medical information; Chinese word segmentation; text categorization; Nutch
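
The abstract's topic-relevance filtering step can be illustrated with a minimal vector space model sketch. This is not the paper's implementation: the term weights in `MEDICAL_TERMS`, the raw term-frequency weighting, the `threshold` value, and all function names are assumptions made for illustration, and the pages are assumed to be already segmented into tokens.

```python
import math
from collections import Counter

# Hypothetical topic vector for the medical domain. The paper derives its term
# library by training on texts; these terms and weights are made up.
MEDICAL_TERMS = {"diabetes": 1.0, "insulin": 1.0, "symptom": 0.8, "treatment": 0.8}


def term_vector(tokens):
    """Build a sparse page vector from raw term frequencies."""
    return Counter(tokens)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def is_medical(tokens, threshold=0.3):
    """Keep a page only if its similarity to the topic vector meets the threshold."""
    return cosine_similarity(term_vector(tokens), MEDICAL_TERMS) >= threshold


# Tokens are assumed to come from an upstream word-segmentation step.
page_a = ["diabetes", "treatment", "insulin", "dose", "insulin"]
page_b = ["football", "match", "score", "league"]
print(is_medical(page_a), is_medical(page_b))  # prints: True False
```

The same similarity score could also serve as the topic-relevance factor in ranking, although the abstract does not say how the paper combines it with the default score.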