
Research on Sentiment Analysis Based on the LDA Topic Model (cited by: 5)

Research of Emotional Analysis Based on LDA Topic Model
Abstract: When extracting features, the LDA topic model does not capture associations between words or related word pairs, which affects the precision of sentiment polarity classification. To address this problem, the paper proposes a new model that introduces a feature-opinion pair extraction method into the LDA topic model in order to improve the extraction of feature-opinion pairs. Dependency parsing is used to design a method for identifying feature-opinion pairs, and this identification method is then introduced into the LDA model as a constraint for extracting the pairs. The parameters are computed by Gibbs sampling, and the generative process of the model is given. Finally, the random forest classification method is used to classify the sentiment polarity of the text. To verify the effectiveness of the proposed model, it was evaluated experimentally alongside two other models. With the number of topics set to 20, the proposed model achieved a precision, recall, and F-measure of 81.54%, 83.13%, and 82.33%, respectively, significantly higher than those of the other two models.
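The three reported scores can be checked against each other. The abstract does not state which F-measure formula was used, so the sketch below assumes the balanced F-measure (the harmonic mean of precision and recall); the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall.

    Assumption: the paper's "F value" is the balanced (beta = 1) variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract for 20 topics, in percent.
precision, recall = 81.54, 83.13
f_value = f_measure(precision, recall)
print(f"{f_value:.2f}")  # 82.33
```

Under that assumption the computed value rounds to the 82.33% given in the abstract, so the three figures are mutually consistent.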
Authors: LIU Yanwen (刘艳文); WEI Yun (魏赟) (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 20009, China)
Source: Electronic Science and Technology (《电子科技》), 2020, No. 7, pp. 12-16, 26 (6 pages)
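Funding: National Natural Science Foundation of China (1170277, 61472256); Scientific Research Program of the Shanghai Science and Technology Commission (16111107502).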
Keywords: product reviews; sentiment analysis; dependency syntax; feature extraction; LDA topic model; random forest algorithm
