
Research on Degree of User Attention to Webpage Based on Baidu Index (基于百度指数的网页用户关注度研究)

Cited by: 13
Abstract: Given the mass of information on the Internet, efficient information retrieval has become a topic of interest for both academia and industry. To improve the efficiency of web search and the accuracy of search engine ranking, this paper proposes a method for computing the theoretical degree of user attention to a webpage. News webpages from Chinese forums are taken as the object of study. Using the user search data provided by Baidu Index, a content-oriented calculation of theoretical user attention is carried out through steps such as main-text extraction, feature-term extraction, and attention calculation. Experiments and regression analysis are conducted on 150 webpages. The results indicate that the optimal number of extracted feature words is 3, and that the correlation coefficient between the theoretical degree of user attention and the actual degree of user attention (page views) is above 0.8, which suggests that the method has a reasonable degree of accuracy.
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), CSSCI, PKU Core, 2012, No. 8, pp. 837-845 (9 pages)
Funding: National Natural Science Foundation of China (70971099); Fundamental Research Funds for the Central Universities.
Keywords: degree of user attention, Baidu Index, feature vector of webpage, regression analysis
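
The pipeline summarized in the abstract (keep the top feature words of a page, score the page from search-index data, then correlate the scores with page views) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the averaging rule below is purely an assumption, and the names `top_k_terms`, `theoretical_attention`, `term_weights`, and `search_index` are hypothetical. Only the Pearson correlation coefficient is a standard, well-defined quantity here.

```python
from math import sqrt


def top_k_terms(term_weights, k=3):
    """Return the k highest-weighted feature terms of a page.

    k=3 mirrors the optimal number of feature words reported in the abstract.
    """
    return sorted(term_weights, key=term_weights.get, reverse=True)[:k]


def theoretical_attention(term_weights, search_index, k=3):
    """Score a page from the search index of its top-k feature terms.

    ASSUMPTION: the score is the plain mean of the terms' search-index
    values; the paper's actual formula is not stated in the abstract.
    """
    terms = top_k_terms(term_weights, k)
    if not terms:
        raise ValueError("page has no feature terms")
    # Terms missing from the index contribute 0 (another assumption).
    return sum(search_index.get(t, 0) for t in terms) / len(terms)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

In use, one would compute `theoretical_attention` for each page and pass the resulting scores, together with the observed page views, to `pearson`; a value above 0.8 would correspond to the level of agreement the abstract reports.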


