
Research and Application of a Hadoop-Based Naive Bayes Algorithm in Sentiment Classification of Chinese Microblogging (cited by: 4)
Abstract: Building on prior work in text sentiment classification, and taking into account that microblog posts are short, rich in emoticons, and full of Internet slang, this paper proposes a distributed Naive Bayes algorithm based on the Map/Reduce programming model for sentiment classification of Chinese microblogs. The algorithm extracts sentiment features by constructing a sentiment dictionary tailored to microblog text. Experimental results show that the method is well suited to microblog sentiment classification, achieves good classification performance, and meets the feasibility and efficiency requirements of processing massive volumes of microblog text.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, No. 7, pp. 60-62, 142 (4 pages)
Funding: Natural Science Foundation of Shanghai (09ZR1409500)
Keywords: Microblogging; Sentiment classification; Hadoop; Map/Reduce; Naive Bayes algorithm
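
The abstract describes the approach only in outline, so the following is a minimal single-process sketch of how Naive Bayes training can be phrased as a map step and a reduce step, not the paper's implementation. The toy lexicon, the function names (`map_phase`, `reduce_phase`, `classify`), the training records, and the use of Laplace smoothing are all assumptions made for illustration. On a real Hadoop cluster the mapper and reducer would run as separate distributed tasks.

```python
import math
from collections import defaultdict

# Hypothetical toy sentiment lexicon (words plus emoticon tags).
# The dictionary built in the paper is not reproduced here.
LEXICON = {"开心", "喜欢", "[哈哈]", "难过", "讨厌", "[泪]"}


def extract_features(tokens):
    """Keep only the tokens that appear in the sentiment lexicon."""
    return [t for t in tokens if t in LEXICON]


def map_phase(record):
    """Mapper: emit a count for the document's class and for each (class, feature)."""
    label, tokens = record
    yield ("doc", label), 1
    for feat in extract_features(tokens):
        yield ("feat", label, feat), 1


def reduce_phase(pairs):
    """Reducer: sum the values per key (shuffle and reduce collapsed into one step)."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(records):
    """Run the map step over every record, then reduce the emitted pairs."""
    pairs = (kv for rec in records for kv in map_phase(rec))
    return reduce_phase(pairs)


def classify(tokens, counts):
    """Pick the class with the highest log posterior (Laplace smoothing assumed)."""
    labels = sorted(k[1] for k in counts if k[0] == "doc")
    n_docs = sum(counts[("doc", c)] for c in labels)
    vocab_size = len(LEXICON)
    best_label, best_score = None, -math.inf
    for c in labels:
        class_total = sum(
            v for k, v in counts.items() if k[0] == "feat" and k[1] == c
        )
        score = math.log(counts[("doc", c)] / n_docs)  # log prior
        for feat in extract_features(tokens):
            n = counts.get(("feat", c, feat), 0)
            score += math.log((n + 1) / (class_total + vocab_size))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Hypothetical pre-segmented training posts: (label, tokens).
TRAIN = [
    ("pos", ["今天", "开心", "[哈哈]"]),
    ("pos", ["喜欢", "这个", "电影"]),
    ("neg", ["好", "难过", "[泪]"]),
    ("neg", ["讨厌", "下雨"]),
]
counts = train(TRAIN)
```

Because the reducer only sums counts per key, the training step is associative and can be split across any number of mappers, which is what makes this formulation a natural fit for Map/Reduce. Calling `classify(["真", "开心"], counts)` on the toy data selects the class whose lexicon features match the post.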


