Abstract
Tracking how topics evolve helps readers obtain information quickly and follow trends. This paper proposes a method for mining how topics change over time, in two steps: topic extraction and topic association. Topics are extracted automatically from the document collection of each time period with the latent Dirichlet allocation (LDA) model, and the number of topics may differ from one period to the next. Topics in adjacent periods are then associated by computing, for every pair of topics, the distance between their distributions (Jensen-Shannon divergence) and the similarity of their feature vectors. Experimental results show that the method describes not only how the strength of a single topic changes over time, but also the emergence of new topics, the disappearance of old ones, and the evolution of topic content.
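The association step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each topic is given as a word-probability distribution over a shared vocabulary, reuses that distribution as the topic's feature vector, and links two topics when the Jensen-Shannon divergence is below a threshold and the cosine similarity is above one. The function names, the thresholds `max_jsd` and `min_cos`, and the toy distributions are all hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions (0 = identical, 1 = disjoint)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def link_topics(prev_topics, next_topics, max_jsd=0.3, min_cos=0.5):
    """Return (i, j) pairs linking topic i of one period to topic j of the next.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            if js_divergence(p, q) <= max_jsd and cosine_similarity(p, q) >= min_cos:
                links.append((i, j))
    return links


# Toy word distributions over a 4-word vocabulary (made-up numbers).
period_1 = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]]
period_2 = [[0.6, 0.3, 0.1, 0.0], [0.0, 0.8, 0.2, 0.0]]

links = link_topics(period_1, period_2)
linked_prev = {i for i, _ in links}
linked_next = {j for _, j in links}

# Topics with no successor have disappeared; topics with no predecessor are new.
vanished = [i for i in range(len(period_1)) if i not in linked_prev]
new_topics = [j for j in range(len(period_2)) if j not in linked_next]
```

In this sketch a linked pair represents one topic continuing across periods, an unlinked topic in the earlier period is read as having disappeared, and an unlinked topic in the later period is read as newly emerged.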
Source
《上海交通大学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2010, No. 11, pp. 1496-1500 (5 pages)
Journal of Shanghai Jiaotong University
Funding
National Natural Science Foundation of China (Grant No. 60873134)
Keywords
topic detection
topic association
topic evolution
latent Dirichlet allocation (LDA)