Abstract
Mining the topics of text content and tracking their evolution plays an important role in text modeling and classification and in improving recommendation quality. Starting from an analysis of the principles of topic mining based on the LDA topic model, and targeting the characteristics of text content in the current social networking environment, this article constructs an LDA model suited to mining topics from dynamic text content. An improved, incremental Gibbs sampling estimation is used to raise the accuracy of topic mining. The evolution of content topics over time is then studied from two aspects: topic similarity and topic intensity. Experiments show that the proposed method is feasible and effective, and that it has practical significance for subsequent research on semantic modeling and classification of text.
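The abstract names topic similarity and topic intensity as the two measures of evolution but does not define them. The sketch below is only an illustration under stated assumptions: similarity is taken to be the cosine similarity between topic-word distributions from two adjacent time slices, and intensity the mean topic proportion over the documents in a slice. The function names and the toy matrices are hypothetical and are not taken from the article.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic-word distributions
    defined over the same vocabulary (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_p * norm_q)


def topic_intensity(doc_topic_rows):
    """Mean proportion of each topic across the documents of one
    time slice (assumed intensity measure)."""
    n_docs = len(doc_topic_rows)
    n_topics = len(doc_topic_rows[0])
    return [sum(row[k] for row in doc_topic_rows) / n_docs
            for k in range(n_topics)]


def match_topics(phi_prev, phi_next):
    """For each topic in the earlier slice, return
    (earlier index, most similar later index, similarity)."""
    matches = []
    for i, p in enumerate(phi_prev):
        sims = [cosine_similarity(p, q) for q in phi_next]
        j = max(range(len(sims)), key=sims.__getitem__)
        matches.append((i, j, sims[j]))
    return matches


# Hypothetical toy data: 2 topics over a 4-word vocabulary, 2 time slices.
phi_t1 = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.3, 0.6]]   # topic-word, slice 1
phi_t2 = [[0.1, 0.1, 0.2, 0.6], [0.6, 0.3, 0.1, 0.0]]   # topic-word, slice 2
theta_t1 = [[0.9, 0.1], [0.5, 0.5]]                      # doc-topic, slice 1
theta_t2 = [[0.2, 0.8], [0.4, 0.6]]                      # doc-topic, slice 2

if __name__ == "__main__":
    for i, j, s in match_topics(phi_t1, phi_t2):
        print(f"slice-1 topic {i} continues as slice-2 topic {j} (similarity {s:.3f})")
    print("intensity, slice 1:", topic_intensity(theta_t1))
    print("intensity, slice 2:", topic_intensity(theta_t2))
```

In practice the topic-word and document-topic matrices would come from an LDA model fitted on each time slice; other measures, such as a divergence between distributions, could be substituted for the cosine similarity used here.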
Source
《图书情报工作》
CSSCI
Peking University Core Journal (北大核心)
2014, No. 2, pp. 138-142 (5 pages)
Library and Information Service
Funding
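Ministry of Education Humanities and Social Sciences Youth Fund project "Research on Topic Mining and Semantic Classification of Information Content in the Social Network Environment" (Grant No. 13YJC870008)
National Natural Science Foundation of China Youth Fund project "Research on Information Recommendation Based on User-Resource Association in the Social Network Environment" (Grant No. 71303178); this article is one of the resulting research outputs.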