Abstract
Microblogging, as an important means of publishing and obtaining information, has become one of the most important media. Users post on microblogs every day, and their posts implicitly contain many important topics. In topic detection, constructing the topic network is a fundamental step. Treating each post as a node and the words shared by two posts as an edge between them, this paper uses the MapReduce programming model as the platform for processing massive data and studies how to construct large-scale topic networks from microblog content. Experiments show that the topic network built with the MapReduce-based method exhibits the relevant properties of social networks, and that topic detection based on this network is more accurate than LDA-based topic detection.
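The construction described in the abstract (posts as nodes, shared words as edges) can be illustrated with a small in-memory imitation of the map and reduce steps. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the whitespace tokenizer, and the sample posts are all hypothetical, and a real deployment would run on a MapReduce framework and would need proper word segmentation and stop-word filtering.

```python
from collections import defaultdict
from itertools import combinations


def map_post(post_id, text):
    """Map step: emit one (word, post_id) pair per distinct word in a post."""
    # Assumption: whitespace tokenization stands in for real word segmentation.
    for word in set(text.lower().split()):
        yield word, post_id


def reduce_word(word, post_ids):
    """Reduce step: every pair of posts sharing this word yields an edge."""
    for a, b in combinations(sorted(set(post_ids)), 2):
        yield (a, b), word


def build_topic_network(posts):
    """Run both steps in memory; return {(post_a, post_b): shared_words}."""
    # Shuffle phase: group the mapper output by word.
    by_word = defaultdict(list)
    for post_id, text in posts.items():
        for word, pid in map_post(post_id, text):
            by_word[word].append(pid)

    # Reduce phase: collect the shared words for each pair of posts.
    edges = defaultdict(set)
    for word, post_ids in by_word.items():
        for edge, shared in reduce_word(word, post_ids):
            edges[edge].add(shared)
    return dict(edges)


# Hypothetical sample posts, keyed by post id.
posts = {
    1: "earthquake hits city",
    2: "city rescue teams arrive",
    3: "new phone released",
}
network = build_topic_network(posts)
```

Grouping by word first means that only posts that actually share a word are ever paired, which avoids comparing every post with every other post. A very common word would still produce many pairs, so a practical version would likely filter such words before the reduce step.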
Source
Journal of Huaihai Institute of Technology: Natural Sciences Edition (《淮海工学院学报(自然科学版)》)
CAS
2014, No. 2, pp. 40-44 (5 pages)
Funding
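Wuxi Vocational College of Science and Technology, Government-Industry-University-Research Cooperation (Joint Promotion and Mutual Appointment) Science and Technology Project (SG14029)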