Log Mining and Personalization Improvement for Mobile Search System of Government Websites

Original title: 政府网站移动搜索的日志挖掘和个性化改进. Cited by: 2
Abstract: To exploit the characteristics of mobile search and government websites, together with Hadoop's strengths in big-data processing, a log mining and personalization system was designed and developed. First, Flume and HDFS collect and store massive volumes of logs, providing the data source and programming interface for log mining. Second, MapReduce analyzes the logs efficiently; using the labels and navigation bars of search result pages, the system builds a vector space model of the pages and a user interest model. Third, based on the user interest model and again using MapReduce, the K-means clustering algorithm groups users with similar interests into interest groups. Finally, the system computes the distance between a search result page and the user's interest group to judge whether the user is interested in that page, then adjusts the ranking of search results and pushes pages to the user accordingly, implementing personalized search and push.
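The clustering and re-ranking steps described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the paper runs these steps on MapReduce, whereas this is plain Python. The function names, the topic dimensions, the choice of Euclidean distance, and the definition of a user's interest vector as the mean of the vectors of clicked pages are all assumptions made for illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def interest_vector(clicked_page_vectors):
    """Hypothetical user interest model: mean of the clicked pages' vectors."""
    return mean_vector(clicked_page_vectors)


def nearest_group(vector, centroids):
    """Index of the centroid (interest group) closest to the vector."""
    return min(range(len(centroids)),
               key=lambda i: distance(vector, centroids[i]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-means; returns k centroids, one per interest group."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest_group(v, centroids)].append(v)
        # Keep the previous centroid if a group ends up empty.
        centroids = [mean_vector(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids


def rerank(pages, centroid):
    """Sort result pages so those closest to the interest group come first."""
    return sorted(pages, key=lambda page: distance(page["vector"], centroid))


if __name__ == "__main__":
    # Hypothetical topic dimensions: (tax, health, education).
    users = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0],
             [0.0, 0.1, 0.9], [0.1, 0.0, 0.9]]
    centroids = kmeans(users, k=2)
    me = interest_vector([[1, 0, 0], [0.8, 0.2, 0]])
    group = centroids[nearest_group(me, centroids)]
    pages = [{"title": "education", "vector": [0, 0, 1]},
             {"title": "tax", "vector": [1, 0, 0]}]
    print([p["title"] for p in rerank(pages, group)])
```

A distance threshold could be layered on top of `rerank` to decide whether a page is pushed at all; the abstract does not state how that cut-off is chosen, so none is assumed here.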
Authors: Ye Xiaorong (叶小榕), Shao Qing (邵晴)
Source: Science & Technology Review (《科技导报》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心), 2014, No. 36, pp. 110-116 (7 pages)
Keywords: personalized search; personalized recommendations; cluster analysis; MapReduce
