A New Method for Estimating the Size of a Web Database (cited by: 1)
Abstract: This paper proposes a method for estimating the size of a Web database based on attribute relevance and the independence of samples. First, the text attribute values returned by submitted queries are segmented with ICTCLAS, the Chinese Academy of Sciences word segmentation system, so that attribute relevance can be computed. Next, the relevance between attributes is used to obtain samples that are approximately independent with respect to those attributes. Finally, the size of the Web database is estimated from the independence of the samples. Experiments show that the method achieves relatively high accuracy.
Source: Information Technology and Informatization (《信息技术与信息化》), 2010, No. 2, pp. 63-66 (4 pages)
Keywords: Deep Web; attribute relevance; Web database size estimation
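
The last step in the abstract, estimating database size from approximately independent samples, can be illustrated with the classical capture-recapture (Lincoln-Petersen) estimator. The abstract does not say which estimator the paper actually uses, so the following is only a minimal sketch under that assumption. The function name `lincoln_petersen` and the record IDs are hypothetical.

```python
def lincoln_petersen(sample_a, sample_b):
    """Estimate population size from two samples of record IDs.

    Assumption (not taken from the paper): the two samples are drawn
    independently, so the share of sample_b that also appears in sample_a
    approximates |sample_a| / N, giving N ~= |A| * |B| / |A intersect B|.
    """
    a, b = set(sample_a), set(sample_b)
    overlap = len(a & b)
    if overlap == 0:
        # With no shared records the estimator is undefined.
        raise ValueError("samples do not overlap; estimate is undefined")
    return len(a) * len(b) / overlap


if __name__ == "__main__":
    # Hypothetical record IDs returned by two sets of queries.
    sample_a = range(0, 200)    # 200 records
    sample_b = range(150, 400)  # 250 records, 50 shared with sample_a
    print(lincoln_petersen(sample_a, sample_b))  # 200 * 250 / 50 = 1000.0
```

The estimate is only as good as the independence assumption, which is why a method of this kind would first need to select queries whose results are weakly correlated.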