Sentiment Analysis of Microblog Text Based on Sentiment Dictionary and Semantic Rule Set (original title: 基于情感词典与语义规则集的微博文本情感分析)


Abstract: In recent years, Chinese media platforms such as microblogs have become increasingly integrated into people's lives. Every day, people post their views, feelings and other subjective information on these platforms. Extracting valuable sentiment information from this content and putting it to use is the task of sentiment analysis. This paper proposes a sentiment analysis method for microblog text based on a sentiment dictionary and a semantic rule set. The method combines several existing basic sentiment dictionaries and constructs a microblog-domain sentiment dictionary from statistical information. To account for the semantic characteristics of Chinese, it also adds a custom semantic rule set. To verify the effectiveness of the method, we used a web crawler to collect 100,000 microblog comments about COVID-19 and ran experiments on this data set. The results show that, compared with the traditional sentiment-dictionary-based method, our method is more accurate and more stable: the recognition accuracy for positive, negative and neutral sentiment reaches 79.4%, 82.5% and 77.3%, respectively. In conclusion, the proposed method achieves high accuracy and generalization ability, identifies sentiment in microblog text effectively, and has practical application value.
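The approach described in the abstract (merged dictionaries plus semantic rules) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy lexicons, the negation and intensifier lists, the weights, and the function names `merge_lexicons`, `score_tokens` and `classify` are all assumptions, since this record does not reproduce the paper's dictionaries or rules. Input is assumed to be already segmented into words.

```python
from typing import Dict, Iterable

# Hypothetical toy lexicons (word -> polarity); the paper's real dictionaries
# are not given in this record.
BASE_LEXICON: Dict[str, float] = {"好": 1.0, "开心": 1.0, "差": -1.0, "难过": -1.0}
DOMAIN_LEXICON: Dict[str, float] = {"加油": 1.0, "破防": -1.0}

# Hypothetical rule resources: negation words flip polarity,
# degree adverbs scale it. The weights are illustrative only.
NEGATIONS = {"不", "没", "没有"}
INTENSIFIERS: Dict[str, float] = {"很": 1.5, "非常": 2.0, "有点": 0.5}


def merge_lexicons(*lexicons: Dict[str, float]) -> Dict[str, float]:
    """Merge several lexicons; later ones override earlier ones on conflict."""
    merged: Dict[str, float] = {}
    for lexicon in lexicons:
        merged.update(lexicon)
    return merged


def score_tokens(tokens: Iterable[str], lexicon: Dict[str, float]) -> float:
    """Sum word polarities, applying modifiers that directly precede a word."""
    total = 0.0
    sign, weight = 1.0, 1.0  # pending negation sign and intensifier weight
    for token in tokens:
        if token in NEGATIONS:
            sign = -sign
        elif token in INTENSIFIERS:
            weight *= INTENSIFIERS[token]
        else:
            if token in lexicon:
                total += sign * weight * lexicon[token]
            # Any non-modifier token ends the current modifier chain.
            sign, weight = 1.0, 1.0
    return total


def classify(tokens: Iterable[str], lexicon: Dict[str, float],
             threshold: float = 0.0) -> str:
    """Map a score to positive / negative / neutral around a threshold."""
    score = score_tokens(tokens, lexicon)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    lexicon = merge_lexicons(BASE_LEXICON, DOMAIN_LEXICON)
    for sentence in (["今天", "很", "开心"], ["不", "好"], ["今天", "上班"]):
        print(sentence, classify(sentence, lexicon))
```

In this sketch a modifier only affects the sentiment word that immediately follows its chain, which is a deliberate simplification; a fuller rule set would also need to handle things like double negation across longer spans, adversative conjunctions, and punctuation.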

Authors: 王伟贤 (Wang Weixian), 吴俊 (Wu Jun)

Affiliation: College of Information Engineering, Yangzhou University (扬州大学信息工程学院)

Source: Computer Science and Application (《计算机科学与应用》), 2023, No. 4, pp. 754-763 (10 pages)

Keywords: sentiment analysis; microblog text; sentiment dictionary; rule set

Classification number: G63 [Culture and Science: Education]



