Abstract
To address sentiment mining in short web texts, this paper proposes the topic sentiment combining model (TSCM), a new topic-sentiment mixture model based on LDA and a behavioral theory of online short reviews. TSCM assumes that each sentence in a review has its own topic distribution, distinct from those of the other sentences. To generate a word, the model first determines the word's sentiment polarity and then its topic, while taking the relations between words into account. Extensive experiments on two real-world datasets (Movie and Amazon) show that TSCM models users' actual sentiment and the topics under discussion more effectively than the representative algorithms JST, S-LDA, D-PLDA, and SAS, with higher accuracy in sentiment classification and topic detection.
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
Peking University Core Journals
2016, No. 8, pp. 1887-1891 (5 pages)
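Funding
National Natural Science Foundation of China (No. 61370078, No. 61363037)
Ministry of Education Humanities and Social Sciences Youth Fund Project (No. 12YJCZH074)
Science and Technology Project of the Fujian Provincial Department of Education (No. JA13077)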
Keywords
sentiment analysis
topic sentiment mixture model
latent Dirichlet allocation (LDA)