Abstract
Two factors strongly affect topic detection performance: distinguishing similar topics and handling topic drift. To address both, this paper proposes a topic detection method based on an adaptive centroid vector. The method incorporates named-entity information into the feature representation, combining a named-entity vector with a keyword vector to represent the topic centroid, so that similar topics can be told apart effectively. Topics are detected by incremental (single-pass) clustering, and the topic centroid is continually revised during clustering to address topic drift. Experimental results and performance comparisons show that the method effectively improves topic detection performance.
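The clustering idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting scheme, similarity measure, threshold, or centroid update rule, so everything specific here is an assumption made for illustration: raw term counts as weights, cosine similarity, a linear mix controlled by a hypothetical `alpha`, a hypothetical `threshold`, and a centroid updated by accumulating counts. The names `cosine` and `detect_topics` and the document layout are likewise hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def detect_topics(docs, alpha=0.5, threshold=0.3):
    """Single-pass clustering with a two-part, continually updated centroid.

    alpha and threshold are illustrative assumptions, not values from the paper.
    Each doc is a dict with "entities" and "keywords" lists of strings.
    """
    topics = []  # each topic: {"entities": Counter, "keywords": Counter}
    labels = []
    for doc in docs:
        ents, kws = Counter(doc["entities"]), Counter(doc["keywords"])
        best, best_sim = None, 0.0
        for i, topic in enumerate(topics):
            # Combine named-entity similarity and keyword similarity.
            sim = (alpha * cosine(topic["entities"], ents)
                   + (1 - alpha) * cosine(topic["keywords"], kws))
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= threshold:
            # Adapt the centroid: accumulated counts are proportional to the
            # mean vector, which is all cosine similarity depends on.
            topics[best]["entities"].update(ents)
            topics[best]["keywords"].update(kws)
            labels.append(best)
        else:
            # No topic is close enough, so the document starts a new one.
            topics.append({"entities": ents, "keywords": kws})
            labels.append(len(topics) - 1)
    return labels, topics


docs = [
    {"entities": ["Beijing", "Olympics"], "keywords": ["opening", "ceremony"]},
    {"entities": ["Beijing", "Olympics"], "keywords": ["ceremony", "fireworks"]},
    {"entities": ["Sichuan"], "keywords": ["earthquake", "rescue"]},
    {"entities": ["Sichuan", "Wenchuan"], "keywords": ["rescue", "relief"]},
]
labels, topics = detect_topics(docs)
print(labels)  # [0, 0, 1, 1]
```

Keeping the named-entity and keyword vectors separate lets the entity overlap be weighted independently of the keyword overlap, and updating the centroid after every assignment lets a topic follow gradual changes in its vocabulary.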
Source
《计算机工程》
CAS
CSCD
PKU Core Journals (北大核心)
2009, Issue 3, pp. 80-82 (3 pages)
Computer Engineering
Funding
National High Technology Research and Development Program of China ("863" Program), Grant No. 2007AA01Z439