
Adaptive Microblog message indexing schema
(Chinese title: 适应性阈值优化的微博消息索引模式, "Microblog message indexing schema with adaptive threshold optimization")
Abstract: To improve the accuracy of microblog search, an adaptive microblog message indexing schema is proposed. First, the forwards and replies of microblog messages are represented as tree structures, and these trees are encoded. Second, a content- and rank-based indexing schema is proposed, which adaptively adjusts the in-memory index data as new messages arrive. Finally, to avoid scanning the entire microblog data set during retrieval, a Top-k threshold optimization method is proposed. Experimental results on a Twitter data set show that the schema reduces the time and space cost of indexing microblog data, and that its performance remains fairly stable over time.
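Authors: Zhang Li (张莉), Li Weiping (李卫平)
Source: Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal (北大核心), 2015, No. 5, pp. 1362-1367 (6 pages)
Funding: Major Fund Project of the Ministry of Public Security (201202ZDYJ017); Key Science and Technology Research Fund Project of the Henan Provincial Department of Education (14A520011)
Keywords: microblog; information retrieval; indexing schema; threshold; social network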
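
The abstract's Top-k threshold idea (stop scanning once no unseen message can enter the result) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The class `RankedIndex`, the scoring formula (rank multiplied by the fraction of query terms matched), and the numeric `rank` values are all assumptions made for illustration only.

```python
import bisect
import heapq
from collections import defaultdict


class RankedIndex:
    """Hypothetical in-memory index: each term's posting list is kept
    sorted by descending message rank (an assumed static score)."""

    def __init__(self):
        self.messages = {}                 # msg_id -> (rank, set of terms)
        self.postings = defaultdict(list)  # term -> sorted [(-rank, msg_id)]

    def add(self, msg_id, terms, rank):
        # Insert a new message; storing -rank keeps the highest rank first.
        terms = set(terms)
        self.messages[msg_id] = (rank, terms)
        for term in terms:
            bisect.insort(self.postings[term], (-rank, msg_id))

    def score(self, msg_id, query):
        # Assumed scoring: rank scaled by the fraction of query terms matched.
        rank, terms = self.messages[msg_id]
        return rank * len(terms & query) / len(query)

    def top_k(self, query_terms, k):
        """Return (results, scanned): the k best (score, msg_id) pairs and
        the number of posting entries read before stopping."""
        query = set(query_terms)
        if not query or k <= 0:
            return [], 0
        lists = [self.postings[t] for t in query if t in self.postings]
        heap, seen, scanned, depth = [], set(), 0, 0
        while any(depth < len(p) for p in lists):
            threshold = 0.0  # upper bound on the score of any unseen message
            for plist in lists:
                if depth >= len(plist):
                    continue
                neg_rank, msg_id = plist[depth]
                scanned += 1
                threshold = max(threshold, -neg_rank)
                if msg_id in seen:
                    continue
                seen.add(msg_id)
                item = (self.score(msg_id, query), msg_id)
                if len(heap) < k:
                    heapq.heappush(heap, item)
                elif item > heap[0]:
                    heapq.heapreplace(heap, item)
            # Stop early: the k-th best score already meets the bound.
            if len(heap) == k and heap[0][0] >= threshold:
                break
            depth += 1
        return sorted(heap, reverse=True), scanned


index = RankedIndex()
for msg_id, terms, rank in [(1, ["storm", "city"], 9.0), (2, ["storm"], 7.0),
                            (3, ["storm", "rain"], 5.0), (4, ["storm"], 3.0),
                            (5, ["storm"], 1.0)]:
    index.add(msg_id, terms, rank)

results, scanned = index.top_k(["storm"], k=2)
print(results, scanned)
```

Because every posting list is ordered by rank and a score can never exceed the rank, the rank at the current scan depth bounds all unseen messages, so the loop can stop without reading the full lists.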


