
Deep Web Data Source Selection for Entity Information Integrated Retrieval
Cited by: 2
Abstract: In Deep Web integrated retrieval, users generally want to submit a query to only a few data sources and still obtain high-quality results, which makes data source selection a key problem. To improve the efficiency of integrated retrieval of entity information, this paper proposes a data source selection method that accounts for both relevance and redundancy. It first gives a method for building Deep Web data source summaries from topic words and sentiment words. User feedback is used to identify the topic category of entity information, and sentiment words are used to measure the content redundancy between data sources. Topic relevance and content redundancy are then combined into a scoring strategy for Deep Web data sources. Experimental results show that the method achieves relatively high accuracy and recall even with small data summaries, and that it provides effective support for integrated retrieval of entity information.
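The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of combining topic relevance with content redundancy. Everything in it is an assumption made for illustration and not the paper's method: the summary layout (`topics` weights and a `sentiment` word histogram), cosine similarity as the redundancy measure, the multiplicative score, and the names `cosine` and `select_sources`.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse word-count histograms (assumed redundancy measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sources(summaries, query_topic, k):
    """Greedily pick up to k sources, favoring topic relevance and penalizing redundancy.

    Hypothetical score: relevance * (1 - highest similarity to any already selected source).
    """
    selected = []
    candidates = set(summaries)
    while candidates and len(selected) < k:
        def score(name):
            relevance = summaries[name]["topics"].get(query_topic, 0.0)
            overlap = max(
                (cosine(summaries[name]["sentiment"], summaries[s]["sentiment"])
                 for s in selected),
                default=0.0,
            )
            return relevance * (1.0 - overlap)

        # Sort candidates first so that ties are broken deterministically by name.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy summaries (invented data): B repeats A's sentiment histogram, C differs.
summaries = {
    "A": {"topics": {"camera": 0.9}, "sentiment": {"good": 5, "sharp": 3}},
    "B": {"topics": {"camera": 0.8}, "sentiment": {"good": 5, "sharp": 3}},
    "C": {"topics": {"camera": 0.5}, "sentiment": {"bad": 4, "blurry": 2}},
}
print(select_sources(summaries, "camera", 2))
```

In this toy data, source B is more relevant than C but duplicates A's sentiment histogram, so once A is chosen the redundancy penalty pushes the sketch toward C as the second source.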
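Author: Deng Song (邓松)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2016, No. 10, pp. 75-79 (5 pages)
Funding: National Natural Science Foundation of China (61462037, 61563016); Natural Science Foundation of Jiangxi Province (20142BAB217014, 20142BAB207009); Jiangxi Provincial Graduate Innovation Fund (YC2012-B021)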
Keywords: data source selection; Deep Web; entity; information integration; user feedback
