
Multiple Truth Value Discovery Algorithm for Conflicting Deep Web Data Based on HITS
Original title: 基于HITS的冲突Deep Web数据多真值发现算法 (cited by: 5)
Abstract: Most existing truth discovery algorithms assume that each data item has a single true value, so they cannot handle cases with multiple true values. To address multi-truth discovery over conflicting Deep Web data, this paper borrows the idea of the Hypertext-Induced Topic Search (HITS) algorithm and defines two mutually dependent quantities: the authority of a view and the credibility of a view's description. On this basis, it constructs a link graph of views and proposes an iterative multi-truth discovery algorithm named MTF. When the algorithm converges, the view with the highest authority is taken as the truth value. Experiments on the Book-Authors dataset show that MTF achieves substantially higher accuracy than the baseline VOTE algorithm.
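The abstract gives the idea behind MTF but not its formulas, so the following is only a minimal sketch of a HITS-style mutual-reinforcement loop next to a majority-vote baseline. It is not the paper's algorithm. Assumptions made here for illustration: a view is the set of atomic descriptions (for example, author names) asserted by one source, a view's authority is the sum of the credibility of its descriptions, and a description's credibility is the sum of the authority of the views that contain it. The function names `vote` and `hits_style_truth` are hypothetical.

```python
from collections import Counter
from math import sqrt


def vote(claims):
    """Baseline: return the view (as a frozenset) asserted by the most sources."""
    counts = Counter(frozenset(c) for c in claims)
    return max(counts, key=counts.get)


def hits_style_truth(claims, max_iter=100, tol=1e-9):
    """HITS-style iteration over views and their descriptions.

    ASSUMPTION (not taken from the paper): authority(view) is the sum of the
    credibility of its descriptions, and credibility(description) is the sum
    of the authority of the views containing it. `claims` must be non-empty.
    """
    views = [frozenset(c) for c in claims]      # one view per source claim
    descriptions = set().union(*views)          # all atomic descriptions seen
    authority = [1.0] * len(views)

    for _ in range(max_iter):
        # Description credibility from the current view authorities.
        credibility = {
            d: sum(a for v, a in zip(views, authority) if d in v)
            for d in descriptions
        }
        # View authority from the updated description credibilities.
        updated = [sum(credibility[d] for d in v) for v in views]
        # L2-normalise so the scores stay bounded, as in HITS.
        norm = sqrt(sum(a * a for a in updated)) or 1.0
        updated = [a / norm for a in updated]

        delta = max(abs(a - b) for a, b in zip(updated, authority))
        authority = updated
        if delta < tol:                         # converged
            break

    best = max(range(len(views)), key=authority.__getitem__)
    return views[best], authority
```

For six hypothetical source claims `[{"A", "B"}, {"A", "B"}, {"A"}, {"A"}, {"A"}, {"B"}]`, `vote` returns `{"A"}` because that exact set is the most frequent, while `hits_style_truth` returns `{"A", "B"}` because both descriptions are well supported. Note that summing credibilities favours larger views; the paper's actual definitions of authority and credibility may differ from this sketch.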
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the PKU Core Journals list; 2016, Issue 9, pp. 158-162 (5 pages)
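Funding: National Social Science Fund of China project "Research on Air Quality Measurement Methods Based on Big Data Integration" (14GSD95); National Statistical Science Research Fund key project "Research on Collection, Storage, and Analysis Schemes for Massive Multi-Source Heterogeneous Data" (2013LZ44); Longyuan Innovative Talent Support Program (14GSD95); Gansu Provincial Department of Finance Fundamental Research Funds for Universities (GZ14007, GZ14023)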
Keywords: Web data source; data model; credibility; view; truth value discovery


