
An Estimate of the Number of Static Web Pages That Have Existed in China (cited by: 12)

An Estimation of the Growth of Chinese Web Pages
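Abstract: Based on a web crawl carried out by Peking University's Tianwang (天网) system in early 2002, and in particular on the number of pages that were still online but whose "last modified" time was not in 2002, this paper proposes a model for estimating the number of web pages that existed in the past. The model is used to estimate, year by year, the number of static web pages in China since 1995. The resulting growth curve supports, in a certain theoretical sense, the common claim that information on the Web is growing explosively.
Abstract (original English, edited): A model is presented for estimating the number of previously existing web pages from the Last-Modified attribute of the HTTP response header. In particular, the model is used to estimate the growth of Chinese web pages since 1995. The result strongly supports the common belief that the Web is growing exponentially in terms of annual statistics.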
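Author: Li Xiaoming (李晓明)
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》), 2003, No. 3, pp. 394-398 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
Funding: supported by the 985 Project and the 973 Program (G1999032706)
Keywords: Internet; Web pages; Web dynamics; World Wide Web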
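
The abstract says the estimate starts from the Last-Modified values seen in one crawl. The record does not give the paper's model, so the following is only a minimal sketch of the raw tally such a model might start from, not the author's method. The function names and sample header values are hypothetical. It counts pages by the year of their Last-Modified header, then accumulates those counts: a surviving page last modified in year Y must have existed by the end of Y, so the running total is a lower bound on the number of pages that existed by that year. It does not correct for pages that were later modified or deleted.

```python
from collections import Counter
from email.utils import parsedate_to_datetime


def pages_by_last_modified_year(last_modified_values):
    """Count pages per year of their Last-Modified header value."""
    counts = Counter()
    for value in last_modified_values:
        try:
            counts[parsedate_to_datetime(value).year] += 1
        except (TypeError, ValueError):
            # Skip missing or malformed header values.
            continue
    return counts


def lower_bound_existing(counts):
    """Running total of pages last modified in or before each year.

    This is only a lower bound on the pages that existed by each year,
    because it sees nothing that was modified or deleted afterwards.
    """
    total = 0
    bounds = {}
    for year in sorted(counts):
        total += counts[year]
        bounds[year] = total
    return bounds


# Hypothetical header values, as a crawler might have recorded them.
sample_headers = [
    "Wed, 21 Oct 1998 07:28:00 GMT",
    "Sat, 01 Jan 2000 00:00:00 GMT",
    "Sat, 01 Jan 2000 12:00:00 GMT",
    "Mon, 01 Jan 2001 00:00:00 GMT",
    "not a date",
    None,
]
sample_counts = pages_by_last_modified_year(sample_headers)
sample_bounds = lower_bound_existing(sample_counts)
```

A real estimate would also need assumptions about how often pages are modified or removed, which this sketch leaves out.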

