
Layering-based micro-blog emotion classification: a case study of Sina micro-blog (cited by: 1)
Abstract: Many current micro-blog sentiment analysis algorithms have difficulty accurately capturing the differences between emotions. To address this, the paper studies the emotional components of Chinese micro-blog posts and their hierarchical emotion classification. Pre-processing removes non-emotional information. The ICTCLAS word segmentation toolkit is used to segment the text, and adjectives, nouns, verbs and similar words are extracted to form features. Features are selected using the chi-square (χ²) test, word frequency and pointwise mutual information (PMI), and classification is performed with support vector regression (SVR) and rule sets. The data set consists of original Chinese micro-blog posts from Sina. Experimental results across different groups verify the effectiveness of the method, which outperforms other similar methods on the F-measure and related metrics at multiple levels. In addition, 50 randomly selected micro-blog posts were judged manually, and nearly half of the results were supported by all the judges.
Authors: WANG Xiang-hua (王向华); SONG Xin (宋欣) (College of Electronic Information Engineering, Tianjin Vocational Institute, Tianjin 300410, China)
Source: Computer Engineering and Design (《计算机工程与设计》), Peking University Core Journal, 2018, No. 11, pp. 3431-3437 (7 pages)
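Funding: Tianjin Basic Research Program fund project (14JCTPJC00553); Tianjin Higher Education Science and Technology Development Fund program project (20130711)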
Keywords: micro-blog; emotion classification; pointwise mutual information; emotional components analysis; support vector regression
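
The abstract names the chi-square test and pointwise mutual information (PMI) as feature-selection scores, but this record does not give the formulas the authors used. The following is a minimal sketch, assuming the standard textbook definitions of both scores over a 2x2 contingency table of one term against one emotion class. The function names, the toy English tokens and the labels `joy` and `sadness` are hypothetical and are not taken from the paper.

```python
import math


def contingency(docs, term, label):
    """Count the 2x2 table for one (term, label) pair.

    docs is a list of (set_of_tokens, label) pairs. Returns (a, b, c, d):
      a = posts containing the term, with the label
      b = posts containing the term, with another label
      c = posts without the term, with the label
      d = posts without the term, with another label
    """
    a = b = c = d = 0
    for tokens, doc_label in docs:
        has_term = term in tokens
        has_label = doc_label == label
        if has_term and has_label:
            a += 1
        elif has_term:
            b += 1
        elif has_label:
            c += 1
        else:
            d += 1
    return a, b, c, d


def pmi(a, b, c, d):
    """Textbook PMI: log2( P(term, label) / (P(term) * P(label)) )."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # term and label never co-occur
    return math.log2(a * n / ((a + b) * (a + c)))


def chi_square(a, b, c, d):
    """Textbook chi-square statistic for a 2x2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or label never varies
    return n * (a * d - b * c) ** 2 / denom


# Toy, already-segmented posts (assumption: a segmenter such as ICTCLAS
# would produce the token sets for real Chinese text).
docs = [
    ({"happy", "sunny"}, "joy"),
    ({"happy", "gift"}, "joy"),
    ({"rain", "lost"}, "sadness"),
    ({"happy", "lost"}, "sadness"),
]

table = contingency(docs, "happy", "joy")
print(table, pmi(*table), chi_square(*table))
```

Terms with high scores for a class would be kept as features. How the paper combines these two scores with word frequency is not stated in this record, so that step is left out.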
