
Representative Negative Comments Extraction Based on Fusion of Gaussian LDA and Spectral Clustering
Cited by: 5
Abstract 【Purpose/Significance】Online reviews, especially negative reviews, are an important basis for consumers' purchasing decisions. However, existing methods for reducing information redundancy still leave room for improvement on negative online reviews. 【Method/Process】This paper proposes a spectral clustering method for negative reviews based on Gaussian LDA. First, the Gaussian LDA model is used to obtain the topic distribution of each negative review. The Pearson similarity between reviews is then computed from these topic distributions, and a spectral clustering algorithm is applied to cluster the negative reviews. Finally, the m reviews closest to each cluster center are extracted as the representative reviews of that cluster. 【Result/Conclusion】Gaussian LDA, LDA, TF-IDF, and Doc2Vec were each combined with spectral clustering, and Gaussian LDA was combined with K-means, DBSCAN, and spectral clustering, in a cross-comparison that verified the superiority of the proposed method. The negative reviews extracted in this way are well separated across clusters and highly representative, and they address the information redundancy problem well. 【Innovation/Limitation】The multi-model ensemble clustering approach, which extracts topics first and then clusters, offers a new method and line of thinking for reducing redundancy in review information, and provides a new reference for research on text mining and text clustering.
Authors WU Yin-hao (吴银昊); ZHAO Narisa (那日萨); LI Hui (李慧) (Institute of Systems Engineering, Dalian University of Technology, Dalian 116024, China)
Source Information Science (《情报科学》), CSSCI, Peking University Core Journal, 2021, No. 3, pp. 136-142 (7 pages)
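Funding National Natural Science Foundation of China, General Program, "Research on Intelligent Technologies for Predicting Online Consumer Group Behavior Based on Online Reviews" (61471083); Ministry of Education Humanities and Social Sciences Research Planning Fund, "Mechanism and Prediction of Online Consumer Group Behavior Based on Online Reviews" (14YJA630044); Dalian Science and Technology Innovation Fund, "Research on Big-Data-Based Intelligent Decision Theory, Methods, and Supporting Technologies in the Construction of Dalian Smart City" (2018J11CY009).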
Keywords Gaussian LDA; topic model; spectral clustering; negative comment; clustering model
