Topic Recognition of Comment Text Based on Sentence-BERT and LDA (cited by: 4)
Abstract [Purpose/Significance] Topic recognition on comment text suffers from insufficient semantic description and weak semantic coherence of the learned topics. To address these problems, this paper combines the Sentence-BERT sentence embedding model with the LDA model to improve the semantic quality of comment-text topics. [Method/Process] The Sentence-BERT model is used to obtain sentence-level vector features of the comment text, and the LDA model is used to obtain its probabilistic topic vectors. An autoencoder then joins the two sets of vectors, and the K-means algorithm clusters the resulting latent-space vectors so that contextual topic information can be extracted from the clusters. [Result/Conclusion] Experiments on a comment text dataset show that the method obtains topic words that carry semantic information. Combining Sentence-BERT with LDA increases the complexity of the model. In comparison, the topic coherence score (Coherence) obtained by the method is better than that of currently common comment-text topic recognition methods.
Authors Ruan Guangce (阮光册); Huang Yunying (黄韵莹). Department of Information Management, East China Normal University, Shanghai 200062, China
Source Journal of Modern Information (《现代情报》), 2023, Issue 5, pp. 46-53 (8 pages)
Keywords Sentence-BERT; LDA model; comment text; topic identification
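
The pipeline described in the abstract (sentence vectors plus LDA topic vectors, joined and then clustered) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the sentence embeddings and topic vectors are already computed (the values below are made-up toy numbers; in practice they would come from a Sentence-BERT model and a trained LDA model), it skips the autoencoder step and clusters the concatenated vectors directly, and the `gamma` weight, the helper names, and the word-frequency way of picking topic words are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def concat_vectors(sent_vecs, topic_vecs, gamma=1.0):
    """Join each sentence embedding with its topic vector.

    gamma is a hypothetical weight on the topic part (not from the paper).
    """
    return [list(s) + [gamma * t for t in tv]
            for s, tv in zip(sent_vecs, topic_vecs)]


def kmeans(points, k, iters=50, seed=0):
    """Plain K-means with Euclidean distance; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def top_words(docs, labels, k, n=3):
    """Most frequent words in each cluster, used here as its topic words."""
    result = []
    for c in range(k):
        counts = Counter(w for d, l in zip(docs, labels) if l == c
                         for w in d.split())
        result.append([w for w, _ in counts.most_common(n)])
    return result


# Toy comments with made-up vectors standing in for model outputs.
docs = ["battery life is great", "battery drains fast",
        "delivery was slow", "delivery arrived late"]
sent_vecs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]       # stand-in for Sentence-BERT
topic_vecs = [[0.95, 0.05], [0.9, 0.1], [0.1, 0.9], [0.05, 0.95]]  # stand-in for LDA

joined = concat_vectors(sent_vecs, topic_vecs, gamma=1.0)
labels = kmeans(joined, k=2)
print(labels)
print(top_words(docs, labels, k=2))
```

In the method the abstract describes, an autoencoder would compress the joined vectors into a latent space before K-means is applied; the sketch omits that step only to stay short and dependency-free.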